6 Commits

Author SHA1 Message Date
3795d9e4ff fix(playlists): make playlist parsing robust against filename and formatting issues
All checks were successful
CI / test (push) Successful in 53s
- Use glob lookup to find playlist files even with trailing spaces in filenames
- Sanitize lines (strip whitespace) before JSON parsing to ignore trailing spaces/empty lines
- Handle JSONDecodeError gracefully to prevent 500 errors from corrupt entries
- Return empty list on FileNotFoundError in read_playlist instead of crashing
- Extract _find_playlist_path and _parse_playlist_lines helpers for reuse
2026-04-05 18:47:21 -05:00
3cf221a1ed minor fix 2026-04-05 18:32:29 -05:00
13a0e6ceed fix(hls): improve audio track selection and auto-detect "Original"
- Auto-select "Original" audio track by default in both native and Plyr HLS players
- Fix native HLS audio selector to use numeric indices instead of string matching
- Robustly detect "original" track by checking both `name` and `lang` attributes
- Fix audio track change handler to correctly switch between available tracks
2026-04-05 18:31:35 -05:00
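The players touched by this commit are JavaScript, so the following Python snippet only illustrates the selection rule the message describes: return a numeric index, and treat a track as "original" if either its `name` or its `lang` says so. The dictionary shape of a track and the fallback to index 0 are assumptions.

```python
def pick_original_track(tracks):
    """Return the index of the first track marked "original", else 0 (None if empty)."""
    for index, track in enumerate(tracks):
        name = (track.get('name') or '').lower()
        lang = (track.get('lang') or '').lower()
        if 'original' in name or 'original' in lang:
            return index
    return 0 if tracks else None
```

Returning an index rather than a label avoids the string-matching problem mentioned in the commit message, since two tracks can share a display name.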
e8e2aa93d6 fix(channel): fix shorts/streams pagination using continuation tokens
- Add continuation_token_cache to store ctokens between page requests
- Use cached ctoken for page 2+ instead of generating fresh tokens
- Switch shorts/streams to Next/Previous buttons (no page numbers)
- Show "N+ videos" indicator when more pages are available
- Fix UnboundLocalError when page_call was undefined for shorts/streams

The issue was that YouTube's InnerTube API requires continuation tokens
for pagination on shorts/streams tabs, but the code was generating a new
ctoken each time, always returning the same 30 videos.
2026-04-05 18:19:05 -05:00
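A rough sketch of the caching idea, assuming a plain in-memory dictionary: the name `continuation_token_cache` is taken from the commit message, while the key layout and both helper functions are hypothetical.

```python
# Assumed layout: (channel_id, tab, page) -> continuation token for that page.
continuation_token_cache = {}


def store_next_ctoken(channel_id, tab, current_page, next_ctoken):
    """Remember the token the API returned for the page after `current_page`."""
    if next_ctoken:
        continuation_token_cache[(channel_id, tab, current_page + 1)] = next_ctoken


def ctoken_for_page(channel_id, tab, page):
    """Return the cached token for `page`; None for page 1 or on a cache miss."""
    if page <= 1:
        return None
    return continuation_token_cache.get((channel_id, tab, page))
```

Because page N+1 can only be requested with the token returned alongside page N, Next/Previous navigation fits this model better than arbitrary page numbers.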
8403e30b3a Many fixes to i18n 2026-04-05 17:43:01 -05:00
f0649be5de Add HLS support to multi-audio 2026-04-05 14:56:51 -05:00
30 changed files with 668 additions and 1115 deletions

381
README.md

@@ -1,313 +1,180 @@
# yt-local
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)
[![Tests](https://img.shields.io/badge/tests-passing-brightgreen.svg)](https://github.com/user234683/youtube-local)
Fork of [youtube-local](https://github.com/user234683/youtube-local)
A privacy-focused, browser-based YouTube client that routes requests through Tor for anonymous viewing—**without compromising on speed or features**.
yt-local is a browser-based client written in Python for watching YouTube anonymously and without the lag of the slow page used by YouTube. One of the primary features is that all requests are routed through Tor, except for the video file at googlevideo.com. This is analogous to what HookTube (defunct) and Invidious do, except that you do not have to trust a third-party to respect your privacy. The assumption here is that Google won't put the effort in to incorporate the video file requests into their tracking, as it's not worth pursuing the incredibly small number of users who care about privacy (Tor video routing is also provided as an option). Tor has high latency, so this will not be as fast network-wise as regular YouTube. However, using Tor is optional; when not routing through Tor, video pages may load faster than they do with YouTube's page depending on your browser.
[Features](#features) • [Install](#install) • [Usage](#usage) • [Screenshots](#screenshots)
---
> [!NOTE]
> How it works: yt-local mirrors YouTube's web requests (using the same Invidious/InnerTube endpoints as yt-dlp and Invidious) but strips JavaScript and serves a lightweight HTML frontend. No API keys needed.
## Overview
yt-local is a lightweight, self-hosted YouTube client written in Python that gives you:
- **Privacy-first**: All requests route through Tor by default (video optional), keeping you anonymous.
- **Fast page loads**: No lazy-loading, no layout reflows, instant comment rendering.
- **Full control**: Customize subtitles, related videos, comments, and playback speed.
- **High quality**: Supports all YouTube video qualities (144p to 2160p) via DASH muxing.
- **Zero ads**: Clean interface, no tracking, no sponsored content.
- **Self-hosted**: You control the instance—no third-party trust required.
## Features
| Category | Features |
|---------------|----------------------------------------------------------------------------------------|
| Core | Search, channels, playlists, watch pages, comments, subtitles (auto/manual) |
| Privacy | Optional Tor routing (including video), automatic circuit rotation on 429 errors |
| Local | Local playlists (durable against YouTube deletions), thumbnail caching |
| UI | 3 themes (Light/Gray/Dark), theater mode, custom font selection |
| Config | Fine-grained settings: subtitle mode, comment visibility, sponsorblock integration |
| Performance | No JavaScript required, instant page rendering, rate limiting with exponential backoff |
| Subscriptions | Import from YouTube Takeout (CSV/JSON), tag organization, mute channels |
### Advanced Capabilities
- SponsorBlock integration — skip sponsored segments automatically
- Custom video speeds — 0.25x to 4x playback rate
- Video transcripts — accessible via transcript button
- Video quality muxing — combine separate video/audio streams for non-360p/720p resolutions
- Tor circuit rotation — automatic new identity on rate limiting (429)
- File downloading — download videos/audio (disabled by default, configurable)
The YouTube API is not used, so no keys or anything are needed. It uses the same requests as the YouTube webpage.
## Screenshots
| Light Theme | Gray Theme | Dark Theme |
|:-----------------------------------------------------:|:----------------------------------------------------:|:----------------------------------------------------:|
| ![Light](https://pic.infini.fr/l7WINjzS/0Ru6MrhA.png) | ![Gray](https://pic.infini.fr/znnQXWNc/hL78CRzo.png) | ![Dark](https://pic.infini.fr/iXwFtTWv/mt2kS5bv.png) |
[Light theme](https://pic.infini.fr/l7WINjzS/0Ru6MrhA.png)
| Channel View | Playlist View |
|:-------------------------------------------------------:|:---------------------:|
| ![Channel](https://pic.infini.fr/JsenWVYe/SbdIQlS6.png) | *(similar structure)* |
[Gray theme](https://pic.infini.fr/znnQXWNc/hL78CRzo.png)
---
[Dark theme](https://pic.infini.fr/iXwFtTWv/mt2kS5bv.png)
## Install
[Channel](https://pic.infini.fr/JsenWVYe/SbdIQlS6.png)
## Features
* Standard pages of YouTube: search, channels, playlists
* Anonymity from Google's tracking by routing requests through Tor
* Local playlists: These solve the two problems with creating playlists on YouTube: (1) they're datamined and (2) videos frequently get deleted by YouTube and lost from the playlist, making it very difficult to find a reupload as the title of the deleted video is not displayed.
* Themes: Light, Gray, and Dark
* Subtitles
* Easily download videos or their audio. (Disabled by default)
* No ads
* View comments
* JavaScript not required
* Theater and non-theater mode
* Subscriptions that are independent from YouTube
* Can import subscriptions from YouTube
* Works by checking channels individually
* Can be set to automatically check channels.
* For efficiency of requests, frequency of checking is based on how quickly channel posts videos
* Can mute channels, so as to have a way to "soft" unsubscribe. Muted channels won't be checked automatically or when using the "Check all" button. Videos from these channels will be hidden.
* Can tag subscriptions to organize them or check specific tags
* Fast page
* No distracting/slow layout rearrangement
* No lazy-loading of comments; they are ready instantly.
* Settings allow fine-tuned control over when/how comments or related videos are shown:
1. Shown by default, with click to hide
2. Hidden by default, with click to show
3. Never shown
* Optionally skip sponsored segments using [SponsorBlock](https://github.com/ajayyy/SponsorBlock)'s API
* Custom video speeds
* Video transcript
* Supports all available video qualities: 144p through 2160p
## Planned features
- [ ] Putting videos from subscriptions or local playlists into the related videos
- [x] Information about video (geographic regions, region of Tor exit node, etc)
- [ ] Ability to delete playlists
- [ ] Auto-saving of local playlist videos
- [ ] Import youtube playlist into a local playlist
- [ ] Rearrange items of local playlist
- [x] Video qualities other than 360p and 720p by muxing video and audio
- [x] Indicate if comments are disabled
- [x] Indicate how many comments a video has
- [ ] Featured channels page
- [ ] Channel comments
- [x] Video transcript
- [x] Automatic Tor circuit change when blocked
- [x] Support &t parameter
- [ ] Subscriptions: Option to mark what has been watched
- [ ] Subscriptions: Option to filter videos based on keywords in title or description
- [ ] Subscriptions: Delete old entries and thumbnails
- [ ] Support for more sites, such as Vimeo, Dailymotion, LBRY, etc.
## Installing
### Windows
1. Download the latest [release ZIP](https://github.com/user234683/yt-local/releases)
2. Extract to any folder
3. Run `run.bat` to start
Download the zip file under the Releases page. Unzip it anywhere you choose.
### GNU/Linux / macOS
### GNU+Linux/MacOS
```bash
# 1. Clone or extract the release
git clone https://github.com/user234683/yt-local.git
cd yt-local
Download the tarball under the Releases page and extract it. `cd` into the directory and run
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
1. `cd yt-local`
2. `virtualenv -p python3 venv`
3. `source venv/bin/activate`
4. `pip install -r requirements.txt`
5. `python server.py`
# 3. Install dependencies
pip install -r requirements.txt
# 4. Run the server
python3 server.py
```
> [!TIP]
> If `pip` isn't installed, use your distro's package manager (e.g., `sudo apt install python3-pip` on Debian/Ubuntu).
### Portable Mode
To keep settings and data in the same directory as the app:
```bash
# Create an empty settings.txt in the project root
touch settings.txt
python3 server.py
# Data now stored in ./data/ instead of ~/.yt-local/
```
---
**Note**: If pip isn't installed, first try installing it from your package manager. Make sure you install pip for python 3. For example, the package you need on debian is python3-pip rather than python-pip. If your package manager doesn't provide it, try to install it according to [this answer](https://unix.stackexchange.com/a/182467), but make sure you run `python3 get-pip.py` instead of `python get-pip.py`
## Usage
### Basic Access
Firstly, if you wish to run this in portable mode, create the empty file "settings.txt" in the program's main directory. If the file is there, settings and data will be stored in the same directory as the program. Otherwise, settings and data will be stored in `C:\Users\[your username]\.yt-local` on Windows and `~/.yt-local` on GNU+Linux/MacOS.
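The portable-mode rule above can be expressed as a small lookup. This is a sketch only: `resolve_data_dir` is a hypothetical helper, not a function from the codebase, and it assumes the presence of `settings.txt` is the sole trigger.

```python
import os


def resolve_data_dir(program_dir, home_dir=None):
    """Return the directory where settings and data would be stored."""
    # Portable mode: an (even empty) settings.txt next to the program.
    if os.path.isfile(os.path.join(program_dir, 'settings.txt')):
        return program_dir
    if home_dir is None:
        home_dir = os.path.expanduser('~')
    # Non-portable mode: ~/.yt-local (C:\Users\<name>\.yt-local on Windows).
    return os.path.join(home_dir, '.yt-local')
```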
1. Start the server:
To run the program on windows, open `run.bat`. On GNU+Linux/MacOS, run `python3 server.py`.
```bash
python3 server.py
# Server runs on http://127.0.0.1:9010 (configurable in /settings)
```
Access youtube URLs by prefixing them with `http://localhost:9010/`.
For instance, `http://localhost:9010/https://www.youtube.com/watch?v=vBgulDeV2RU`
You can use an addon such as Redirector ([Firefox](https://addons.mozilla.org/en-US/firefox/addon/redirector/)|[Chrome](https://chrome.google.com/webstore/detail/redirector/ocgpenflpmgnfapjedencafcfakcekcd)) to automatically redirect YouTube URLs to yt-local. I use the include pattern `^(https?://(?:[a-zA-Z0-9_-]*\.)?(?:youtube\.com|youtu\.be|youtube-nocookie\.com)/.*)` and redirect pattern `http://localhost:9010/$1` (Make sure you're using regular expression mode).
2. Access YouTube via proxy:
If you want embeds on the web to also redirect to yt-local, make sure "Iframes" is checked under advanced options in your Redirector rule. You can test this with `http://localhost:9010/youtube.com/embed/vBgulDeV2RU`.
```bash
http://localhost:9010/https://www.youtube.com/watch?v=vBgulDeV2RU
```
yt-local can be added as a search engine in Firefox to make searching more convenient. See [here](https://support.mozilla.org/en-US/kb/add-or-remove-search-engine-firefox) for information on Firefox search plugins.
All YouTube URLs must be prefixed with `http://localhost:9010/https://`.
### Using Tor
3. (Optional) Use Redirector to auto-redirect YouTube URLs:
In the settings page, set "Route Tor" to "On, except video" (the second option). Be sure to save the settings.
- **Firefox**: [Redirector addon](https://addons.mozilla.org/firefox/addon/redirector/)
- **Chrome**: [Redirector addon](https://chrome.google.com/webstore/detail/redirector/ocgpenflpmgnfapjedencafcfakcekcd)
- **Pattern**: `^(https?://(?:[a-zA-Z0-9_-]*\.)?(?:youtube\.com|youtu\.be|youtube-nocookie\.com)/.*)`
- **Redirect to**: `http://localhost:9010/$1`
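To check what the include pattern matches before pasting it into Redirector, it can be exercised in Python. This is only an approximation: Redirector evaluates JavaScript regular expressions, and `to_local` is a hypothetical helper that mimics the `$1` substitution.

```python
import re

# Same include pattern as in the Redirector rule.
YOUTUBE_URL = re.compile(
    r'^(https?://(?:[a-zA-Z0-9_-]*\.)?'
    r'(?:youtube\.com|youtu\.be|youtube-nocookie\.com)/.*)'
)


def to_local(url, base='http://localhost:9010/'):
    """Prefix YouTube URLs with the yt-local base; leave other URLs untouched."""
    match = YOUTUBE_URL.match(url)
    return base + match.group(1) if match else url
```

For example, `to_local('https://youtu.be/vBgulDeV2RU')` yields the yt-local address, while a non-YouTube URL is returned unchanged.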
Ensure Tor is listening for SOCKS5 connections on port 9150. A simple way to accomplish this is to open the Tor Browser Bundle and leave it open. However, you will not access the program (at http://localhost:9010) through the Tor Browser; use your regular browser for that. Running the Tor Browser is just a quick way to give the program access to Tor routing.
> [!NOTE]
> To use embeds on web pages, make sure "Iframes" is checked under advanced options in your redirector rule.
### Standalone Tor
### Tor Routing
If you don't want to waste system resources leaving the Tor Browser open in addition to your regular browser, you can configure standalone Tor to run instead using the following instructions.
> [!IMPORTANT]
> Recommended for privacy. In `/settings`, set **Route Tor** to `"On, except video"` (or `"On, including video"`), then save.
For Windows, to make standalone Tor run at startup, press Windows Key + R and type `shell:startup` to open the Startup folder. Create a new shortcut there. For the command of the shortcut, enter `"C:\[path-to-Tor-Browser-directory]\Tor\tor.exe" SOCKSPort 9150 ControlPort 9151`. You can then launch this shortcut to start it. Alternatively, if something isn't working, to see what's wrong, open `cmd.exe` and go to the directory `C:\[path-to-Tor-Browser-directory]\Tor`. Then run `tor SOCKSPort 9150 ControlPort 9151 | more`. The `more` part at the end is just to make sure any errors are displayed, to fix a bug in Windows cmd where tor doesn't display any output. You can stop tor in the task manager.
#### Running Tor
For Debian/Ubuntu, you can `sudo apt install tor` to install the command line version of Tor, and then run `sudo systemctl start tor` to run it as a background service that will get started during boot as well. However, Tor on the command line uses the port `9050` by default (rather than the 9150 used by the Tor Browser). So you will need to change `Tor port` to 9050 and `Tor control port` to `9051` in yt-local settings page. Additionally, you will need to enable the Tor control port by uncommenting the line `ControlPort 9051`, and setting `CookieAuthentication` to 0 in `/etc/tor/torrc`. If no Tor package is available for your distro, you can configure the `tor` binary located at `./Browser/TorBrowser/Tor/tor` inside the Tor Browser installation location to run at start time, or create a service to do it.
Option A: Tor Browser (easiest)
### Tor video routing
- Launch Tor Browser and leave it running
- yt-local uses port `9150` (Tor Browser default)
If you wish to route the video through Tor, set "Route Tor" to "On, including video". Because this is bandwidth-intensive, you are strongly encouraged to donate to the [consortium of Tor node operators](https://torservers.net/donate.html). For instance, donations to [NoiseTor](https://noisetor.net/) go straight towards funding nodes. Using their numbers for bandwidth costs, together with an average of 485 kbit/sec for a diverse sample of videos, and assuming n hours of video watched per day, gives $0.03n/month. A $1/month donation will be a very generous amount to not only offset losses, but help keep the network healthy.
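The arithmetic behind that estimate can be reproduced in part. The 485 kbit/sec bitrate and the $0.03n/month result come from the paragraph above; the 30-day month is an assumption, and the per-gigabyte price is back-calculated from those two numbers rather than taken from NoiseTor.

```python
AVG_BITRATE_BITS_PER_SEC = 485_000  # 485 kbit/sec, from the paragraph above
DAYS_PER_MONTH = 30                 # assumption


def gigabytes_per_month(hours_per_day):
    """Decimal gigabytes transferred per month at the average bitrate."""
    bits = AVG_BITRATE_BITS_PER_SEC * 3600 * hours_per_day * DAYS_PER_MONTH
    return bits / 8 / 1e9


def implied_cost_per_gb(monthly_cost_per_daily_hour=0.03):
    """Price per GB implied by the $0.03n/month figure (back-calculated)."""
    return monthly_cost_per_daily_hour / gigabytes_per_month(1)
```

Under these assumptions, one hour of video per day comes to about 6.5 GB per month, which implies a bandwidth price of a little under half a cent per gigabyte.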
Option B: Standalone Tor
In general, Tor video routing will be slower (for instance, seeking within the video is quite slow). I've never seen any signs that watch history in yt-local affects on-site YouTube recommendations. It's likely that requests to googlevideo are logged for some period of time, but are not integrated into YouTube's larger advertisement/recommendation systems, since those presumably depend more heavily on in-page tracking through JavaScript than on CDN requests to googlevideo.
```bash
# Linux (Debian/Ubuntu)
sudo apt install tor
sudo systemctl enable --now tor
### Importing subscriptions
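# Configure yt-local ports in /settings (standalone Tor defaults, not Tor Browser's 9150/9151):
# Tor port: 9050
# Tor control port: 9051
# The control port must also be enabled (`ControlPort 9051`) in /etc/tor/torrc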
```
> [!WARNING]
> Video over Tor is bandwidth-intensive. Consider donating to [Tor node operators](https://torservers.net/donate.html) to sustain the network.
### Import Subscriptions
1. Go to [Google Takeout](https://takeout.google.com/takeout/custom/youtube)
2. Deselect all → select only **Subscriptions** → create export
3. Download and extract `subscriptions.csv` (path: `YouTube and YouTube Music/subscriptions/subscriptions.csv`)
4. In yt-local: **Subscriptions** → **Import** → upload CSV
> [!IMPORTANT]
> The CSV file must contain columns: `channel_id,channel_name,channel_url`
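A quick way to check a file against that requirement before uploading it is sketched below. It assumes the three column names exactly as stated in the note; `parse_subscriptions_csv` is a hypothetical helper, not yt-local's importer.

```python
import csv
import io

REQUIRED_COLUMNS = ('channel_id', 'channel_name', 'channel_url')


def parse_subscriptions_csv(text):
    """Return (channel_id, channel_name) pairs, or raise if columns are missing."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError('Missing CSV columns: ' + ', '.join(missing))
    return [
        (row['channel_id'].strip(), (row.get('channel_name') or '').strip())
        for row in reader
        if row.get('channel_id')  # skip rows without a channel id
    ]
```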
## Supported formats
1. Go to the [Google takeout manager](https://takeout.google.com/takeout/custom/youtube).
2. Log in if asked.
3. Click on "All data included", then on "Deselect all", then select only "subscriptions" and click "OK".
4. Click on "Next step" and then on "Create export".
5. Click on the "Download" button after it appears.
6. From the downloaded takeout zip extract the .csv file. It is usually located under `YouTube and YouTube Music/subscriptions/subscriptions.csv`
7. Go to the subscriptions manager in yt-local. In the import area, select your .csv file, then press import.
Supported subscriptions import formats:
- NewPipe subscriptions export JSON
- Google Takeout CSV
- Google Takeout JSON (legacy)
- NewPipe JSON export
- OPML (from YouTube's old subscription manager)
- Old Google Takeout JSON
- OPML format from now-removed YouTube subscriptions manager
---
## Contributing
## Configuration
Pull requests and issues are welcome
Visit `http://localhost:9010/settings` to configure:
For coding guidelines and an overview of the software architecture, see the [HACKING.md](docs/HACKING.md) file.
| Setting | Description |
|--------------------|-------------------------------------------------|
| Route Tor | Off / On (except video) / On (including video) |
| Default subtitles | Off / Manual only / Auto + Manual |
| Comments mode | Shown by default / Hidden by default / Never |
| Related videos | Same options as comments |
| Theme | Light / Gray / Dark |
| Font | Browser default / Serif / Sans-serif |
| Default resolution | Auto / 144p to 2160p |
| SponsorBlock | Enable Sponsored segments skipping |
| Proxy images | Route thumbnails through yt-local (for privacy) |
---
## Troubleshooting
| Issue | Solution |
|------------------------------|----------------------------------------------------------------------------------------------|
| Port already in use | Change `port_number` in `/settings` or kill existing process: `pkill -f "python3 server.py"` |
| 429 Too Many Requests | Enable Tor routing for automatic IP rotation, or wait 5-10 minutes |
| Failed to connect to Tor | Verify Tor is running (`tor --version` only confirms it is installed; on Linux try `systemctl status tor`) or launch Tor Browser |
| Subscriptions not importing | Ensure CSV has columns: `channel_id,channel_name,channel_url` |
| Settings persist across runs | Check `~/.yt-local/settings.txt` (non-portable) or `./settings.txt` (portable) |
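For the port-related rows above, a small probe can tell whether something is already listening on a local port. This is a generic sketch using only the standard library; `port_in_use` is a hypothetical helper and not part of yt-local.

```python
import socket


def port_in_use(port, host='127.0.0.1', timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising.
        return sock.connect_ex((host, port)) == 0
```

`port_in_use(9010)` indicates whether the yt-local port is already taken, and the same call with 9150 or 9050 shows whether a Tor SOCKS port is reachable.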
---
## Development
### Running Tests
## GPG public KEY
```bash
source venv/bin/activate # if not already in venv
make test
72CFB264DFC43F63E098F926E607CE7149F4D71C
```
### Project Structure
## Public instances
```bash
yt-local/
├── youtube/ # Core application logic
│ ├── __init__.py # Flask app entry point
│ ├── util.py # HTTP utilities, Tor manager, fetch_url
│ ├── watch.py # Video/playlist page handlers
│ ├── channel.py # Channel page handlers
│ ├── playlist.py # Playlist handlers
│ ├── search.py # Search handlers
│ ├── comments.py # Comment extraction/rendering
│ ├── subscriptions.py # Subscription management + SQLite
│ ├── local_playlist.py # Local playlist CRUD
│ ├── proto.py # YouTube protobuf token generation
│ ├── yt_data_extract/ # Polymer JSON parsing abstractions
│ └── hls_cache.py # HLS audio/video streaming proxy
├── templates/ # Jinja2 HTML templates
├── static/ # CSS/JS assets
├── translations/ # i18n files (Babel)
├── tests/ # pytest test suite
├── server.py # WSGI entry point
├── settings.py # Settings parser + admin page
├── generate_release.py # Windows release builder
└── manage_translations.py # i18n maintenance script
```
yt-local is not designed to run in public mode. However, there is a public instance of yt-local with fewer features:
> [!NOTE]
> For detailed architecture guidance, see [`docs/HACKING.md`](docs/HACKING.md).
### Contributing
Contributions welcome! Please:
1. Read [`docs/HACKING.md`](docs/HACKING.md) for coding guidelines
2. Follow [PEP 8](https://peps.python.org/pep-0008/) style (use `ruff format`)
3. Run tests before submitting: `pytest`
4. Ensure no security issues: `bandit -r .`
5. Update docs for new features
---
## Security Notes
- **No API keys required** — uses same endpoints as public YouTube web interface
- **Tor is optional** — disable in `/settings` if you prefer performance over anonymity
- **Rate limiting handled** — exponential backoff (max 5 retries) with automatic Tor circuit rotation
- **Path traversal protected** — user input validated against regex whitelists (CWE-22)
- **Subprocess calls secure** — build scripts use `subprocess.run([...])` instead of shell (CWE-78)
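The retry behaviour named in the rate-limiting bullet can be sketched as below. Only "exponential backoff", "max 5 retries" and "Tor circuit rotation" come from the list above; the `RateLimited` exception, the `new_identity` callback, and the one-second base delay are assumptions made for illustration.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by `fetch` when the server answers HTTP 429."""


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0,
                       new_identity=None, sleep=time.sleep):
    """Call `fetch`, retrying on RateLimited with doubling delays."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted
            if new_identity is not None:
                new_identity()  # e.g. request a new Tor circuit
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` and `new_identity` as parameters keeps the function testable without real delays or a running Tor instance.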
> [!NOTE]
> GPG key for release verification: `72CFB264DFC43F63E098F926E607CE7149F4D71C`
---
## Public Instances
yt-local is designed for self-hosting.
---
## Donate
This project is 100% free and open-source. If you'd like to support development:
- **Bitcoin**: `1JrC3iqs3PP5Ge1m1vu7WE8LEf4S85eo7y`
- **Tor node donation**: https://torservers.net/donate
---
- <https://m.fridu.us/https://youtube.com>
## License
GNU Affero General Public License v3.0+
This project is licensed under the GNU Affero General Public License v3 (GNU AGPLv3) or any later version.
See [`LICENSE`](LICENSE) for full text.
Permission is hereby granted to the youtube-dl project at [https://github.com/ytdl-org/youtube-dl](https://github.com/ytdl-org/youtube-dl) to relicense any portion of this software under the Unlicense, public domain, or whichever license is in use by youtube-dl at the time of relicensing, for the purpose of inclusion of said portion into youtube-dl. Relicensing permission is not granted for any purpose outside of direct inclusion into the [official repository](https://github.com/ytdl-org/youtube-dl) of youtube-dl. If inclusion happens during the process of a pull-request, relicensing happens at the moment the pull request is merged into youtube-dl; until that moment, any cloned repositories of youtube-dl which make use of this software are subject to the terms of the GNU AGPLv3.
### Exception for youtube-dl
## Donate
This project is completely free/Libre and will always be.
Permission is granted to relicense code portions under youtube-dl's license (currently the Unlicense) for direct inclusion into the [official youtube-dl repository](https://github.com/ytdl-org/youtube-dl). This exception **does not apply** to forks or other uses; those remain under AGPLv3.
#### Crypto:
- **Bitcoin**: `1JrC3iqs3PP5Ge1m1vu7WE8LEf4S85eo7y`
---
## Similar Projects
| Project | Type | Notes |
|--------------------------------------------------------------|----------|--------------------------------------|
| [invidious](https://github.com/iv-org/invidious) | Server | Multi-user instance, REST API |
| [Yotter](https://github.com/ytorg/Yotter) | Server | YouTube + Twitter integration |
| [FreeTube](https://github.com/FreeTubeApp/FreeTube) | Desktop | Electron-based client |
| [NewPipe](https://newpipe.schabi.org/) | Mobile | Android-only, no JavaScript |
| [mps-youtube](https://github.com/mps-youtube/mps-youtube) | Terminal | CLI-based, text UI |
| [youtube-local](https://github.com/user234683/youtube-local) | Browser | Original project (base for yt-local) |
---
Made for privacy-conscious users
Last updated: 2026-04-19
## Similar projects
- [invidious](https://github.com/iv-org/invidious) Similar to this project, but also allows it to be hosted as a server to serve many users
- [Yotter](https://github.com/ytorg/Yotter) Similar to this project and to invidious. Also supports Twitter
- [FreeTube](https://github.com/FreeTubeApp/FreeTube) (Similar to this project, but is an electron app outside the browser)
- [youtube-local](https://github.com/user234683/youtube-local) first project on which yt-local is based
- [NewPipe](https://newpipe.schabi.org/) (app for android)
- [mps-youtube](https://github.com/mps-youtube/mps-youtube) (terminal-only program)
- [youtube-viewer](https://github.com/trizen/youtube-viewer)
- [smtube](https://www.smtube.org/)
- [Minitube](https://flavio.tordini.org/minitube), [github here](https://github.com/flaviotordini/minitube)
- [toogles](https://github.com/mikecrittenden/toogles) (only embeds videos, doesn't use mp4)
- [YTLibre](https://git.sr.ht/~heckyel/ytlibre) only extract video
- [youtube-dl](https://rg3.github.io/youtube-dl/), which this project was based off


@@ -1,108 +1,76 @@
# Basic init yt-local for openrc
## Basic init yt-local for openrc
## Prerequisites
1. Write `/etc/init.d/ytlocal` file.
- System with OpenRC installed and configured.
- Administrative privileges (doas or sudo).
- `ytlocal` script located at `/usr/sbin/ytlocal` and application files in an accessible directory.
```
#!/sbin/openrc-run
# Distributed under the terms of the GNU General Public License v3 or later
name="yt-local"
pidfile="/var/run/ytlocal.pid"
command="/usr/sbin/ytlocal"
## Service Installation
depend() {
use net
}
1. **Create the OpenRC service script** `/etc/init.d/ytlocal`:
start_pre() {
if [ ! -f /usr/sbin/ytlocal ] ; then
eerror "Please create script file of ytlocal in '/usr/sbin/ytlocal'"
return 1
else
return 0
fi
}
```sh
#!/sbin/openrc-run
# Distributed under the terms of the GNU General Public License v3 or later
name="yt-local"
pidfile="/var/run/ytlocal.pid"
command="/usr/sbin/ytlocal"
start() {
ebegin "Starting yt-local"
start-stop-daemon --start --exec "${command}" --pidfile "${pidfile}"
eend $?
}
depend() {
use net
}
reload() {
ebegin "Reloading ${name}"
start-stop-daemon --signal HUP --pidfile "${pidfile}"
eend $?
}
start_pre() {
if [ ! -f /usr/sbin/ytlocal ]; then
eerror "Please create script file of ytlocal in '/usr/sbin/ytlocal'"
return 1
else
return 0
fi
}
stop() {
ebegin "Stopping ${name}"
start-stop-daemon --quiet --stop --exec "${command}" --pidfile "${pidfile}"
eend $?
}
```
start() {
ebegin "Starting yt-local"
start-stop-daemon --start --exec "${command}" --pidfile "${pidfile}"
eend $?
}
Afterwards, set execute permissions:
reload() {
ebegin "Reloading ${name}"
start-stop-daemon --signal HUP --pidfile "${pidfile}"
eend $?
}
$ doas chmod a+x /etc/init.d/ytlocal
stop() {
ebegin "Stopping ${name}"
start-stop-daemon --quiet --stop --exec "${command}" --pidfile "${pidfile}"
eend $?
}
```
> [!NOTE]
> Ensure the script is executable:
>
> ```sh
> doas chmod a+x /etc/init.d/ytlocal
> ```
2. Write `/usr/sbin/ytlocal` and configure path.
2. **Create the executable script** `/usr/sbin/ytlocal`:
```
#!/usr/bin/env bash
```bash
#!/usr/bin/env bash
cd /home/your-path/ytlocal/ # change me
source venv/bin/activate
python server.py > /dev/null 2>&1 &
echo $! > /var/run/ytlocal.pid
```
# Change the working directory according to your installation path
# Example: if installed in /usr/local/ytlocal, use:
cd /home/your-path/ytlocal/ # <-- MODIFY TO YOUR PATH
source venv/bin/activate
python server.py > /dev/null 2>&1 &
echo $! > /var/run/ytlocal.pid
```
Afterwards, set execute permissions:
> [!WARNING]
> Run this script only as root or via `doas`, as it writes to `/var/run` and uses network privileges.
$ doas chmod a+x /usr/sbin/ytlocal
> [!TIP]
> To store the PID in a different location, adjust the `pidfile` variable in the service script.
> [!IMPORTANT]
> Verify that the virtual environment (`venv`) is correctly set up and that `python` points to the appropriate version.
3. OpenRC check
> [!CAUTION]
> Do not stop the process manually; use OpenRC commands (`rc-service ytlocal stop`) to avoid race conditions.
- status: `doas rc-service ytlocal status`
- start: `doas rc-service ytlocal start`
- restart: `doas rc-service ytlocal restart`
- stop: `doas rc-service ytlocal stop`
> [!NOTE]
> When run with administrative privileges, the configuration is saved in `/root/.yt-local`, which is root-only.
- enable: `doas rc-update add ytlocal default`
- disable: `doas rc-update del ytlocal`
## Service Management
- **Status**: `doas rc-service ytlocal status`
- **Start**: `doas rc-service ytlocal start`
- **Restart**: `doas rc-service ytlocal restart`
- **Stop**: `doas rc-service ytlocal stop`
- **Enable at boot**: `doas rc-update add ytlocal default`
- **Disable**: `doas rc-update del ytlocal`
## Post-Installation Verification
- Confirm the process is running: `doas rc-service ytlocal status`
- Inspect logs for issues: `doas tail -f /var/log/ytlocal.log` (if logging is configured).
## Troubleshooting Common Issues
- **Service fails to start**: verify script permissions, correct `command=` path, and that the virtualenv exists.
- **Port conflict**: adjust the server's port configuration before launching.
- **Import errors**: ensure all dependencies are installed in the virtual environment.
> [!IMPORTANT]
> Keep the service script updated when modifying startup logic or adding new dependencies.
When yt-local is run with administrator privileges,
the configuration file is stored in /root/.yt-local


@@ -33,7 +33,7 @@ def check_subp(x):
raise Exception('Got nonzero exit code from command')
def log(line):
print(f'[generate_release.py] {line}')
print('[generate_release.py] ' + line)
# https://stackoverflow.com/questions/7833715/python-deleting-certain-file-extensions
def remove_files_with_extensions(path, extensions):
@@ -43,33 +43,27 @@ def remove_files_with_extensions(path, extensions):
os.remove(os.path.join(root, file))
def download_if_not_exists(file_name, url, sha256=None):
if not os.path.exists(f'./{file_name}'):
# Reject non-https URLs so a mistaken constant cannot cause a
# plaintext download (bandit B310 hardening).
if not url.startswith('https://'):
raise Exception(f'Refusing to download over non-https URL: {url}')
log(f'Downloading {file_name}..')
if not os.path.exists('./' + file_name):
log('Downloading ' + file_name + '..')
data = urllib.request.urlopen(url).read()
log(f'Finished downloading {file_name}')
with open(f'./{file_name}', 'wb') as f:
log('Finished downloading ' + file_name)
with open('./' + file_name, 'wb') as f:
f.write(data)
if sha256:
digest = hashlib.sha256(data).hexdigest()
if digest != sha256:
log(f'Error: {file_name} has wrong hash: {digest}')
log('Error: ' + file_name + ' has wrong hash: ' + digest)
sys.exit(1)
else:
log(f'Using existing {file_name}')
log('Using existing ' + file_name)
def wine_run_shell(command):
# Keep argv-style invocation (no shell) to avoid command injection.
if os.name == 'posix':
parts = ['wine'] + command.replace('\\', '/').split()
check(os.system('wine ' + command.replace('\\', '/')))
elif os.name == 'nt':
parts = command.split()
check(os.system(command))
else:
raise Exception('Unsupported OS')
check(subprocess.run(parts).returncode)
def wine_run(command_parts):
if os.name == 'posix':
@@ -98,20 +92,7 @@ if os.path.exists('./yt-local'):
# confused with working directory. I'm calling it the same thing so it will
# have that name when extracted from the final release zip archive)
log('Making copy of yt-local files')
# Avoid the shell: pipe `git archive` into 7z directly via subprocess.
_git_archive = subprocess.Popen(
['git', 'archive', '--format', 'tar', 'master'],
stdout=subprocess.PIPE,
)
_sevenz = subprocess.Popen(
['7z', 'x', '-si', '-ttar', '-oyt-local'],
stdin=_git_archive.stdout,
)
_git_archive.stdout.close()
_sevenz.wait()
_git_archive.wait()
check(_sevenz.returncode)
check(_git_archive.returncode)
check(os.system('git archive --format tar master | 7z x -si -ttar -oyt-local'))
if len(os.listdir('./yt-local')) == 0:
raise Exception('Failed to copy yt-local files')
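The two variants in this hunk (an `os.system` shell pipe versus two chained `subprocess.Popen` calls) can be compared with a minimal sketch of the shell-free form. `run_pipeline` is a hypothetical helper name, not something in the script, and the sketch only assumes that both commands are given as argv lists.

```python
import subprocess
import sys


def run_pipeline(producer_argv, consumer_argv):
    """Run the equivalent of `producer | consumer` without a shell.

    Hypothetical helper for illustration; returns both exit codes.
    """
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout)
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits before reading everything.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    return producer.returncode, consumer.returncode


if __name__ == '__main__':
    # Stand-in commands; the release script would pass the git and 7z argv lists.
    print(run_pipeline(
        [sys.executable, '-c', 'print("x")'],
        [sys.executable, '-c', 'import sys; sys.stdin.read()'],
    ))
```

Because each command is an argv list, no part of it is interpreted by a shell, which is the property the argv-style variant of the hunk is after.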
@@ -120,7 +101,7 @@ if len(os.listdir('./yt-local')) == 0:
# ----------- Generate embedded python distribution -----------
os.environ['PYTHONDONTWRITEBYTECODE'] = '1' # *.pyc files double the size of the distribution
get_pip_url = 'https://bootstrap.pypa.io/get-pip.py'
latest_dist_url = f'https://www.python.org/ftp/python/{latest_version}/python-{latest_version}'
latest_dist_url = 'https://www.python.org/ftp/python/' + latest_version + '/python-' + latest_version
if bitness == '32':
latest_dist_url += '-embed-win32.zip'
else:
@@ -142,7 +123,7 @@ else:
download_if_not_exists('get-pip.py', get_pip_url)
python_dist_name = f'python-dist-{latest_version}-{bitness}.zip'
python_dist_name = 'python-dist-' + latest_version + '-' + bitness + '.zip'
download_if_not_exists(python_dist_name, latest_dist_url)
download_if_not_exists(visual_c_name,
@@ -155,7 +136,7 @@ if os.path.exists('./python'):
log('Extracting python distribution')
check_subp(subprocess.run(['7z', '-y', 'x', '-opython', python_dist_name]))
check(os.system(r'7z -y x -opython ' + python_dist_name))
log('Executing get-pip.py')
wine_run(['./python/python.exe', '-I', 'get-pip.py'])
@@ -203,7 +184,7 @@ and replaced with a .pth. Isolated mode will have to be specified manually.
log('Removing ._pth')
major_release = latest_version.split('.')[1]
os.remove(rf'./python/python3{major_release}._pth')
os.remove(r'./python/python3' + major_release + '._pth')
log('Adding path_fixes.pth')
with open(r'./python/path_fixes.pth', 'w', encoding='utf-8') as f:
@@ -214,7 +195,7 @@ with open(r'./python/path_fixes.pth', 'w', encoding='utf-8') as f:
# Need to add the directory where packages are installed,
# and the parent directory (which is where the yt-local files are)
major_release = latest_version.split('.')[1]
with open(rf'./python/python3{major_release}._pth', 'a', encoding='utf-8') as f:
with open('./python/python3' + major_release + '._pth', 'a', encoding='utf-8') as f:
f.write('.\\Lib\\site-packages\n')
f.write('..\n')'''
@@ -255,12 +236,12 @@ log('Copying python distribution into release folder')
shutil.copytree(r'./python', r'./yt-local/python')
# ----------- Create release zip -----------
output_filename = f'yt-local-{release_tag}-{suffix}.zip'
if os.path.exists(f'./{output_filename}'):
output_filename = 'yt-local-' + release_tag + '-' + suffix + '.zip'
if os.path.exists('./' + output_filename):
log('Removing previous zipped release')
os.remove(f'./{output_filename}')
os.remove('./' + output_filename)
log('Zipping release')
check_subp(subprocess.run(['7z', '-mx=9', 'a', output_filename, './yt-local']))
check(os.system(r'7z -mx=9 a ' + output_filename + ' ./yt-local'))
print('\n')
log('Finished')
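The checks that `download_if_not_exists` applies in one version of this file (an https-only guard and an optional SHA-256 comparison) can be sketched in isolation. `check_download` is a hypothetical name, and raising `ValueError` is an assumption of this sketch; the script itself logs the mismatch and calls `sys.exit(1)`.

```python
import hashlib


def check_download(url, data, sha256=None):
    """Validate downloaded bytes before they are written to disk.

    Hypothetical helper: the release script performs these checks inline.
    """
    # Refuse plaintext transports so a mistyped constant cannot downgrade.
    if not url.startswith('https://'):
        raise ValueError(f'Refusing to download over non-https URL: {url}')
    if sha256:
        digest = hashlib.sha256(data).hexdigest()
        if digest != sha256:
            raise ValueError(f'wrong hash: {digest}')
    return data
```

Keeping the validation separate from the network call makes it testable without fetching anything.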


@@ -1,28 +1,22 @@
#!/usr/bin/env python3
# E402 is deliberately ignored in this file: `monkey.patch_all()` must run
# before any stdlib networking or gevent-dependent modules are imported.
from gevent import monkey
monkey.patch_all()
import gevent.socket
from youtube import yt_app
from youtube import util
# these are just so the files get run - they import yt_app and add routes to it
from youtube import (
watch,
search,
playlist,
channel,
local_playlist,
comments,
subscriptions,
)
from youtube import watch, search, playlist, channel, local_playlist, comments, subscriptions
import settings
from gevent.pywsgi import WSGIServer
import urllib
import urllib3
import socket
import socks, sockshandler
import subprocess
import re
import sys
import time
@@ -32,9 +26,9 @@ def youtu_be(env, start_response):
id = env['PATH_INFO'][1:]
env['PATH_INFO'] = '/watch'
if not env['QUERY_STRING']:
env['QUERY_STRING'] = f'v={id}'
env['QUERY_STRING'] = 'v=' + id
else:
env['QUERY_STRING'] += f'&v={id}'
env['QUERY_STRING'] += '&v=' + id
yield from yt_app(env, start_response)
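The rewrite performed by `youtu_be` can be exercised without a WSGI server. The sketch below isolates just the environ manipulation; `rewrite_short_link` is a hypothetical name and the video id in the usage note is made up.

```python
def rewrite_short_link(env):
    """Turn a youtu.be-style path (/VIDEO_ID) into a /watch request, in place."""
    video_id = env['PATH_INFO'][1:]  # drop the leading slash
    env['PATH_INFO'] = '/watch'
    if not env['QUERY_STRING']:
        env['QUERY_STRING'] = f'v={video_id}'
    else:
        # Keep any existing parameters (for example a timestamp).
        env['QUERY_STRING'] += f'&v={video_id}'
    return env
```

For an environ with `PATH_INFO` of `/abc123` and `QUERY_STRING` of `t=30`, the result targets `/watch` with the query string `t=30&v=abc123`.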
@@ -61,15 +55,17 @@ def proxy_site(env, start_response, video=False):
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)',
'Accept': '*/*',
}
current_range_start = 0
range_end = None
if 'HTTP_RANGE' in env:
send_headers['Range'] = env['HTTP_RANGE']
url = f"https://{env['SERVER_NAME']}{env['PATH_INFO']}"
url = "https://" + env['SERVER_NAME'] + env['PATH_INFO']
# remove /name portion
if video and '/videoplayback/name/' in url:
url = url[0:url.rfind('/name/')]
if env['QUERY_STRING']:
url += f'?{env["QUERY_STRING"]}'
url += '?' + env['QUERY_STRING']
try_num = 1
first_attempt = True
@@ -96,7 +92,7 @@ def proxy_site(env, start_response, video=False):
+[('Access-Control-Allow-Origin', '*')])
if first_attempt:
start_response(f"{response.status} {response.reason}",
start_response(str(response.status) + ' ' + response.reason,
response_headers)
content_length = int(dict(response_headers).get('Content-Length', 0))
@@ -136,8 +132,9 @@ def proxy_site(env, start_response, video=False):
fail_byte = start + total_received
send_headers['Range'] = 'bytes=%d-%d' % (fail_byte, end)
print(
f'Warning: YouTube closed the connection before byte {fail_byte}. '
f'Expected {start+content_length} bytes.'
'Warning: YouTube closed the connection before byte',
str(fail_byte) + '.', 'Expected', start+content_length,
'bytes.'
)
retry = True
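The retry path in this hunk resumes a truncated response by requesting the remaining bytes. A minimal sketch of that arithmetic follows; `resume_range_header` is a hypothetical name, and it assumes `start` and `end` are the byte offsets of the range originally requested.

```python
def resume_range_header(start, total_received, end):
    """Build the Range header value for the bytes that are still missing."""
    # First byte that was never received from the upstream server.
    fail_byte = start + total_received
    return 'bytes=%d-%d' % (fail_byte, end)
```

If a request for bytes 0 to 999 is cut off after 400 bytes, the follow-up request asks for `bytes=400-999`.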
@@ -184,7 +181,7 @@ def split_url(url):
# python STILL doesn't have a proper regular expression engine like grep uses built in...
match = re.match(r'(?:https?://)?([\w-]+(?:\.[\w-]+)+?)(/.*|$)', url)
if match is None:
raise ValueError(f'Invalid or unsupported url: {url}')
raise ValueError('Invalid or unsupported url: ' + url)
return match.group(1), match.group(2)
@@ -237,7 +234,7 @@ def site_dispatch(env, start_response):
if base_name == '':
base_name = domain
else:
base_name = f"{domain}.{base_name}"
base_name = domain + '.' + base_name
try:
handler = site_handlers[base_name]
@@ -277,8 +274,6 @@ class FilteredRequestLog:
if __name__ == '__main__':
if settings.allow_foreign_addresses:
# Binding to all interfaces is opt-in via the
# `allow_foreign_addresses` setting and documented as discouraged.
server = WSGIServer(('0.0.0.0', settings.port_number), site_dispatch,
log=FilteredRequestLog())
ip_server = '0.0.0.0'


@@ -261,20 +261,10 @@ For security reasons, enabling this is not recommended.''',
'category': 'interface',
}),
('native_player_storyboard', {
'type': bool,
'default': False,
'label': 'Storyboard preview (native)',
'comment': '''Show thumbnail preview on hover (native player modes).
Positioning is heuristic; may misalign in Firefox/Safari.
Works best on Chromium browsers.
No effect in Plyr.''',
'category': 'interface',
}),
('use_video_download', {
'type': int,
'default': 0,
'comment': '',
'options': [
(0, 'Disabled'),
(1, 'Enabled'),
@@ -397,14 +387,14 @@ acceptable_targets = SETTINGS_INFO.keys() | {
def comment_string(comment):
result = ''
for line in comment.splitlines():
result += f'# {line}\n'
result += '# ' + line + '\n'
return result
def save_settings(settings_dict):
with open(settings_file_path, 'w', encoding='utf-8') as file:
for setting_name, setting_info in SETTINGS_INFO.items():
file.write(f"{comment_string(setting_info['comment'])}{setting_name} = {repr(settings_dict[setting_name])}\n\n")
file.write(comment_string(setting_info['comment']) + setting_name + ' = ' + repr(settings_dict[setting_name]) + '\n\n')
def add_missing_settings(settings_dict):
@@ -481,7 +471,7 @@ upgrade_functions = {
def log_ignored_line(line_number, message):
print(f'WARNING: Ignoring settings.txt line {line_number} ({message})')
print("WARNING: Ignoring settings.txt line " + str(node.lineno) + " (" + message + ")")
if os.path.isfile("settings.txt"):
@@ -509,33 +499,29 @@ else:
else:
# parse settings in a safe way, without exec
current_settings_dict = {}
# Python 3.8+ uses ast.Constant; older versions use ast.Num, ast.Str, ast.NameConstant
attributes = {
ast.Constant: 'value',
ast.NameConstant: 'value',
ast.Num: 'n',
ast.Str: 's',
}
try:
attributes[ast.Num] = 'n'
attributes[ast.Str] = 's'
attributes[ast.NameConstant] = 'value'
except AttributeError:
pass # Removed in Python 3.12+
module_node = ast.parse(settings_text)
for node in module_node.body:
if not isinstance(node, ast.Assign):
log_ignored_line(node.lineno, 'only assignments are allowed')
if type(node) != ast.Assign:
log_ignored_line(node.lineno, "only assignments are allowed")
continue
if len(node.targets) > 1:
log_ignored_line(node.lineno, 'only simple single-variable assignments allowed')
log_ignored_line(node.lineno, "only simple single-variable assignments allowed")
continue
target = node.targets[0]
if not isinstance(target, ast.Name):
log_ignored_line(node.lineno, 'only simple single-variable assignments allowed')
if type(target) != ast.Name:
log_ignored_line(node.lineno, "only simple single-variable assignments allowed")
continue
if target.id not in acceptable_targets:
log_ignored_line(node.lineno, f"{target.id} is not a valid setting")
log_ignored_line(node.lineno, target.id + " is not a valid setting")
continue
if type(node.value) not in attributes:
@@ -645,6 +631,6 @@ def settings_page():
for func, old_value, value in to_call:
func(old_value, value)
return flask.redirect(f'{util.URL_ORIGIN}/settings', 303)
return flask.redirect(util.URL_ORIGIN + '/settings', 303)
else:
flask.abort(400)
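The exec-free settings parser touched by this file's diff walks the module's AST and accepts only simple assignments to known names. The sketch below follows the same checks but is not the file's implementation: `parse_settings` is a hypothetical name, and it reads values with `ast.literal_eval` instead of the `attributes` lookup table used in the diff.

```python
import ast


def parse_settings(settings_text, acceptable_targets):
    """Return (settings, ignored); ignored holds (line number, reason) pairs."""
    settings, ignored = {}, []
    for node in ast.parse(settings_text).body:
        if not isinstance(node, ast.Assign):
            ignored.append((node.lineno, 'only assignments are allowed'))
            continue
        if len(node.targets) > 1 or not isinstance(node.targets[0], ast.Name):
            ignored.append((node.lineno,
                            'only simple single-variable assignments allowed'))
            continue
        name = node.targets[0].id
        if name not in acceptable_targets:
            ignored.append((node.lineno, f'{name} is not a valid setting'))
            continue
        try:
            # literal_eval accepts an AST node and evaluates literals only,
            # so calls such as open('x') are rejected rather than executed.
            settings[name] = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            ignored.append((node.lineno, 'only literals allowed for values'))
    return settings, ignored
```

Nothing from the settings text is ever executed; anything that is not a plain literal assignment to an allowed name is reported and skipped.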


@@ -11,7 +11,8 @@ import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
import youtube.proto as proto
from youtube.yt_data_extract.common import (
extract_item_info, extract_items,
extract_item_info, extract_items, extract_shorts_lockup_view_model_info,
extract_approx_int,
)
@@ -27,7 +28,7 @@ class TestChannelCtokenV5:
def _decode_outer(self, ctoken):
"""Decode the outer protobuf layer of a ctoken."""
raw = base64.urlsafe_b64decode(f'{ctoken}==')
raw = base64.urlsafe_b64decode(ctoken + '==')
return {fn: val for _, fn, val in proto.read_protobuf(raw)}
def test_shorts_token_generates_without_error(self):
@@ -57,59 +58,6 @@ class TestChannelCtokenV5:
assert t_shorts != t_streams
assert t_videos != t_streams
def test_include_shorts_false_adds_filter(self):
"""Test that include_shorts=False adds the shorts filter (field 104)."""
# Token with shorts included (default)
t_with_shorts = self.channel_ctoken_v5('UCtest', '1', '3', 'videos', include_shorts=True)
# Token with shorts excluded
t_without_shorts = self.channel_ctoken_v5('UCtest', '1', '3', 'videos', include_shorts=False)
# The tokens should be different because of the shorts filter
assert t_with_shorts != t_without_shorts
# Decode and verify the filter is present
raw_with_shorts = base64.urlsafe_b64decode(f'{t_with_shorts}==')
raw_without_shorts = base64.urlsafe_b64decode(f'{t_without_shorts}==')
# Parse the outer protobuf structure
import youtube.proto as proto
outer_fields_with = list(proto.read_protobuf(raw_with_shorts))
outer_fields_without = list(proto.read_protobuf(raw_without_shorts))
# Field 80226972 contains the inner data
inner_with = [v for _, fn, v in outer_fields_with if fn == 80226972][0]
inner_without = [v for _, fn, v in outer_fields_without if fn == 80226972][0]
# Parse the inner data - field 3 contains percent-encoded base64 data
inner_fields_with = list(proto.read_protobuf(inner_with))
inner_fields_without = list(proto.read_protobuf(inner_without))
# Get field 3 data (the encoded inner which is percent-encoded base64)
encoded_inner_with = [v for _, fn, v in inner_fields_with if fn == 3][0]
encoded_inner_without = [v for _, fn, v in inner_fields_without if fn == 3][0]
# The inner without shorts should contain field 104
# Decode the percent-encoded base64 data
import urllib.parse
decoded_with = urllib.parse.unquote(encoded_inner_with.decode('ascii'))
decoded_without = urllib.parse.unquote(encoded_inner_without.decode('ascii'))
# Decode the base64 data
decoded_with_bytes = base64.urlsafe_b64decode(f'{decoded_with}==')
decoded_without_bytes = base64.urlsafe_b64decode(f'{decoded_without}==')
# Parse the decoded protobuf data
fields_with = list(proto.read_protobuf(decoded_with_bytes))
fields_without = list(proto.read_protobuf(decoded_without_bytes))
field_numbers_with = [fn for _, fn, _ in fields_with]
field_numbers_without = [fn for _, fn, _ in fields_without]
# The 'with' version should NOT have field 104
assert 104 not in field_numbers_with
# The 'without' version SHOULD have field 104
assert 104 in field_numbers_without
# --- shortsLockupViewModel parsing ---
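The tests above decode ctokens by appending `'=='` before calling `base64.urlsafe_b64decode`. A minimal sketch of why that works, assuming CPython's default (non-strict) decoder, which ignores padding beyond what it needs; `decode_token` is a hypothetical name.

```python
import base64


def decode_token(token):
    """Decode URL-safe base64 whose '=' padding may have been stripped."""
    # Two pad characters are enough for any valid unpadded length;
    # padding the decoder does not need is ignored.
    return base64.urlsafe_b64decode(token + '==')
```

This avoids computing the exact amount of missing padding from the token length.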


@@ -39,8 +39,7 @@ class NewIdentityState():
self.new_identities_till_success -= 1
def fetch_url_response(self, *args, **kwargs):
def cleanup_func(response):
return None
cleanup_func = (lambda r: None)
if self.new_identities_till_success == 0:
return MockResponse(), cleanup_func
return MockResponse(body=html429, status=429), cleanup_func


@@ -1,17 +1,14 @@
import logging
import os
import re
import traceback
from sys import exc_info
import flask
import jinja2
from flask import request
from flask_babel import Babel
from youtube import util
from .get_app_version import app_version
import flask
from flask import request
import jinja2
import settings
import traceback
import logging
import re
from sys import exc_info
from flask_babel import Babel
yt_app = flask.Flask(__name__)
yt_app.config['TEMPLATES_AUTO_RELOAD'] = True
@@ -29,6 +26,7 @@ yt_app.logger.addFilter(FetchErrorFilter())
# yt_app.jinja_env.lstrip_blocks = True
# Configure Babel for i18n
import os
yt_app.config['BABEL_DEFAULT_LOCALE'] = 'en'
# Use absolute path for translations directory to avoid issues with package structure changes
_app_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
@@ -76,7 +74,7 @@ theme_names = {
@yt_app.context_processor
def inject_theme_preference():
return {
'theme_path': f'/youtube.com/static/{theme_names[settings.theme]}.css',
'theme_path': '/youtube.com/static/' + theme_names[settings.theme] + '.css',
'settings': settings,
# Detect version
'current_version': app_version()['version'],
@@ -145,9 +143,9 @@ def error_page(e):
' exit node is overutilized. Try getting a new exit node by'
' using the New Identity button in the Tor Browser.')
if fetch_err.error_message:
error_message += f'\n\n{fetch_err.error_message}'
error_message += '\n\n' + fetch_err.error_message
if fetch_err.ip:
error_message += f'\n\nExit node IP address: {fetch_err.ip}'
error_message += '\n\nExit node IP address: ' + fetch_err.ip
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
elif error_code == '429':
@@ -157,7 +155,7 @@ def error_page(e):
'• Enable Tor routing in Settings for automatic IP rotation\n'
'• Use a VPN to change your IP address')
if fetch_err.ip:
error_message += f'\n\nYour IP: {fetch_err.ip}'
error_message += '\n\nYour IP: ' + fetch_err.ip
return flask.render_template('error.html', error_message=error_message, slim=slim), 429
elif error_code == '502' and ('Failed to resolve' in str(fetch_err) or 'Failed to establish' in str(fetch_err)):
@@ -179,7 +177,7 @@ def error_page(e):
# Catch-all for any other FetchError (400, etc.)
error_message = f'Error communicating with YouTube ({error_code}).'
if fetch_err.error_message:
error_message += f'\n\n{fetch_err.error_message}'
error_message += '\n\n' + fetch_err.error_message
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
return flask.render_template('error.html', traceback=traceback.format_exc(),


@@ -6,7 +6,9 @@ import settings
import urllib
import json
from string import Template
import youtube.proto as proto
import html
import math
import gevent
import re
@@ -31,9 +33,9 @@ headers_mobile = (
real_cookie = (('Cookie', 'VISITOR_INFO1_LIVE=8XihrAcN1l4'),)
generic_cookie = (('Cookie', 'VISITOR_INFO1_LIVE=ST1Ti53r4fU'),)
# FIXED 2026: YouTube changed continuation token structure (from Invidious commit a9f8127)
# Sort values for YouTube API (from Invidious): 2=popular, 4=newest, 5=oldest
# include_shorts only applies to tab='videos'; tab='shorts'/'streams' always include their own content.
def channel_ctoken_v5(channel_id, page, sort, tab, view=1, include_shorts=True):
def channel_ctoken_v5(channel_id, page, sort, tab, view=1):
# Tab-specific protobuf field numbers (from Invidious source)
# Each tab uses different field numbers in the protobuf structure:
# videos: 110 -> 3 -> 15 -> { 2:{1:UUID}, 4:sort, 8:{1:UUID, 3:sort} }
@@ -72,11 +74,6 @@ def channel_ctoken_v5(channel_id, page, sort, tab, view=1, include_shorts=True):
inner_container = proto.string(3, tab_wrapper)
outer_container = proto.string(110, inner_container)
# Add shorts filter when include_shorts=False (field 104, same as playlist.py)
# This tells YouTube to exclude shorts from the results
if not include_shorts:
outer_container += proto.string(104, proto.uint(2, 1))
encoded_inner = proto.percent_b64encode(outer_container)
pointless_nest = proto.string(80226972,
@@ -239,12 +236,12 @@ def channel_ctoken_v1(channel_id, page, sort, tab, view=1):
def get_channel_tab(channel_id, page="1", sort=3, tab='videos', view=1,
ctoken=None, print_status=True, include_shorts=True):
ctoken=None, print_status=True):
message = 'Got channel tab' if print_status else None
if not ctoken:
if tab in ('videos', 'shorts', 'streams'):
ctoken = channel_ctoken_v5(channel_id, page, sort, tab, view, include_shorts)
ctoken = channel_ctoken_v5(channel_id, page, sort, tab, view)
else:
ctoken = channel_ctoken_v3(channel_id, page, sort, tab, view)
ctoken = ctoken.replace('=', '%3D')
@@ -253,7 +250,7 @@ def get_channel_tab(channel_id, page="1", sort=3, tab='videos', view=1,
# For now it seems to be constant for the API endpoint, not dependent
# on the browsing session or channel
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = f'https://www.youtube.com/youtubei/v1/browse?key={key}'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
data = {
'context': {
@@ -285,36 +282,25 @@ def get_number_of_videos_channel(channel_id):
return 1000
# Uploads playlist
playlist_id = f'UU{channel_id[2:]}'
url = f'https://m.youtube.com/playlist?list={playlist_id}&pbj=1'
playlist_id = 'UU' + channel_id[2:]
url = 'https://m.youtube.com/playlist?list=' + playlist_id + '&pbj=1'
try:
response = util.fetch_url(url, headers_mobile,
debug_name='number_of_videos', report_text='Got number of videos')
except (urllib.error.HTTPError, util.FetchError):
except (urllib.error.HTTPError, util.FetchError) as e:
traceback.print_exc()
print("Couldn't retrieve number of videos")
return 1000
response = response.decode('utf-8')
# Try several patterns since YouTube's format changes:
# "numVideosText":{"runs":[{"text":"1,234"},{"text":" videos"}]}
# "stats":[..., {"runs":[{"text":"1,234"},{"text":" videos"}]}]
for pattern in (
r'"numVideosText".*?"text":\s*"([\d,]+)"',
r'"numVideosText".*?([\d,]+)\s*videos?',
r'"numVideosText".*?([,\d]+)',
r'([\d,]+)\s*videos?\s*</span>',
):
match = re.search(pattern, response)
if match:
try:
return int(match.group(1).replace(',', ''))
except ValueError:
continue
# Fallback: unknown count
return 0
# match = re.search(r'"numVideosText":\s*{\s*"runs":\s*\[{"text":\s*"([\d,]*) videos"', response)
match = re.search(r'"numVideosText".*?([,\d]+)', response)
if match:
return int(match.group(1).replace(',',''))
else:
return 0
def set_cached_number_of_videos(channel_id, num_videos):
@cachetools.cached(number_of_videos_cache)
def dummy_func_using_same_cache(channel_id):
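The multi-pattern variant of the video-count lookup in the hunk above can be tried on sample strings. The patterns are copied from that hunk; `extract_video_count` is a hypothetical wrapper name, and the sample inputs in the usage note are invented for illustration.

```python
import re

# Tried in order, from most specific to most permissive.
COUNT_PATTERNS = (
    r'"numVideosText".*?"text":\s*"([\d,]+)"',
    r'"numVideosText".*?([\d,]+)\s*videos?',
    r'"numVideosText".*?([,\d]+)',
    r'([\d,]+)\s*videos?\s*</span>',
)


def extract_video_count(response):
    """Return the first parsable video count, or 0 when none is found."""
    for pattern in COUNT_PATTERNS:
        match = re.search(pattern, response)
        if match:
            try:
                return int(match.group(1).replace(',', ''))
            except ValueError:
                # The group matched only commas; try the next pattern.
                continue
    return 0
```

A JSON fragment such as `"numVideosText":{"runs":[{"text":"1,234"},{"text":" videos"}]}` yields 1234, while text with no recognizable count yields 0, which callers treat as unknown.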
@@ -328,7 +314,7 @@ def get_channel_id(base_url):
# method that gives the smallest possible response at ~4 kb
# needs to be as fast as possible
base_url = base_url.replace('https://www', 'https://m') # avoid redirect
response = util.fetch_url(f'{base_url}/about?pbj=1', headers_mobile,
response = util.fetch_url(base_url + '/about?pbj=1', headers_mobile,
debug_name='get_channel_id', report_text='Got channel id').decode('utf-8')
match = channel_id_re.search(response)
if match:
@@ -372,7 +358,7 @@ def get_channel_search_json(channel_id, query, page):
ctoken = base64.urlsafe_b64encode(proto.nested(80226972, ctoken)).decode('ascii')
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = f'https://www.youtube.com/youtubei/v1/browse?key={key}'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
data = {
'context': {
@@ -414,18 +400,18 @@ def post_process_channel_info(info):
def get_channel_first_page(base_url=None, tab='videos', channel_id=None, sort=None):
if channel_id:
base_url = f'https://www.youtube.com/channel/{channel_id}'
base_url = 'https://www.youtube.com/channel/' + channel_id
# Build URL with sort parameter
# YouTube URL sort params: p=popular, dd=newest, lad=newest no shorts
# Note: 'da' (oldest) was removed by YouTube in January 2026
url = f'{base_url}/{tab}?pbj=1&view=0'
url = base_url + '/' + tab + '?pbj=1&view=0'
if sort:
# Map sort values to YouTube's URL parameter values
sort_map = {'3': 'dd', '4': 'lad'}
url += f'&sort={sort_map.get(sort, "dd")}'
url += '&sort=' + sort_map.get(sort, 'dd')
return util.fetch_url(url, headers_desktop, debug_name=f'gen_channel_{tab}')
return util.fetch_url(url, headers_desktop, debug_name='gen_channel_' + tab)
playlist_sort_codes = {'2': "da", '3': "dd", '4': "lad"}
@@ -439,35 +425,35 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
page_number = int(request.args.get('page', 1))
# sort 1: views
# sort 2: oldest
# sort 3: newest (includes shorts, via UU uploads playlist)
# sort 4: newest - no shorts (uses channel Videos tab API directly, like Invidious)
# sort 4: newest - no shorts (Just a kludge on our end, not internal to yt)
default_sort = '3' if settings.include_shorts_in_channel else '4'
sort = request.args.get('sort', default_sort)
view = request.args.get('view', '1')
query = request.args.get('query', '')
ctoken = request.args.get('ctoken', '')
include_shorts = (sort != '4')
default_params = (page_number == 1 and sort in ('3', '4') and view == '1')
continuation = bool(ctoken)
continuation = bool(ctoken) # whether or not we're using a continuation
page_size = 30
try_channel_api = True
polymer_json = None
number_of_videos = 0
info = None
# -------------------------------------------------------------------------
# sort=3: use UU uploads playlist (includes shorts)
# -------------------------------------------------------------------------
if tab == 'videos' and sort == '3':
# Use the special UU playlist which contains all the channel's uploads
if tab == 'videos' and sort in ('3', '4'):
if not channel_id:
channel_id = get_channel_id(base_url)
if page_number == 1:
if page_number == 1 and include_shorts:
tasks = (
gevent.spawn(playlist.playlist_first_page,
f'UU{channel_id[2:]}',
'UU' + channel_id[2:],
report_text='Retrieved channel videos'),
gevent.spawn(get_metadata, channel_id),
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
# Ignore the metadata for now, it is cached and will be
# recalled later
pl_json = tasks[0].value
pl_info = yt_data_extract.extract_playlist_info(pl_json)
number_of_videos = pl_info['metadata']['video_count']
@@ -477,71 +463,87 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
set_cached_number_of_videos(channel_id, number_of_videos)
else:
tasks = (
gevent.spawn(playlist.get_videos, f'UU{channel_id[2:]}',
page_number, include_shorts=True),
gevent.spawn(playlist.get_videos, 'UU' + channel_id[2:],
page_number, include_shorts=include_shorts),
gevent.spawn(get_metadata, channel_id),
gevent.spawn(get_number_of_videos_channel, channel_id),
gevent.spawn(playlist.playlist_first_page, f'UU{channel_id[2:]}',
report_text='Retrieved channel video count'),
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
pl_json = tasks[0].value
pl_info = yt_data_extract.extract_playlist_info(pl_json)
first_page_meta = yt_data_extract.extract_playlist_metadata(tasks[3].value)
number_of_videos = (tasks[2].value
or first_page_meta.get('video_count')
or 0)
number_of_videos = tasks[2].value
if pl_info['items']:
info = pl_info
info['channel_id'] = channel_id
info['current_tab'] = 'videos'
info = pl_info
info['channel_id'] = channel_id
info['current_tab'] = 'videos'
if info['items']: # Success
page_size = 100
# else fall through to the channel browse API below
try_channel_api = False
else: # Try the first-page method next
try_channel_api = True
# -------------------------------------------------------------------------
# Channel browse API: sort=4 (videos tab, no shorts), shorts, streams,
# or fallback when the UU playlist returned no items.
# Uses channel_ctoken_v5 per-tab tokens, mirroring Invidious's approach.
# Pagination is driven by the continuation token YouTube returns each page.
# -------------------------------------------------------------------------
used_channel_api = False
if info is None and (
tab in ('shorts', 'streams')
or (tab == 'videos' and sort == '4')
or (tab == 'videos' and sort == '3') # UU-playlist fallback
):
# Use the regular channel API
if tab in ('shorts', 'streams') or (tab=='videos' and try_channel_api):
if not channel_id:
channel_id = get_channel_id(base_url)
used_channel_api = True
# Determine what browse call to make
if ctoken:
browse_call = (util.call_youtube_api, 'web', 'browse',
{'continuation': ctoken})
continuation = True
elif page_number > 1:
cache_key = (channel_id, tab, sort, page_number - 1)
cached_ctoken = continuation_token_cache.get(cache_key)
if cached_ctoken:
browse_call = (util.call_youtube_api, 'web', 'browse',
{'continuation': cached_ctoken})
# For shorts/streams, use continuation token from cache or request
if tab in ('shorts', 'streams'):
if ctoken:
# Use ctoken directly from request (passed via pagination)
polymer_json = util.call_youtube_api('web', 'browse', {
'continuation': ctoken,
})
continuation = True
elif page_number > 1:
# For page 2+, get ctoken from cache
cache_key = (channel_id, tab, sort, page_number - 1)
cached_ctoken = continuation_token_cache.get(cache_key)
if cached_ctoken:
polymer_json = util.call_youtube_api('web', 'browse', {
'continuation': cached_ctoken,
})
continuation = True
else:
# Fallback: generate fresh ctoken
page_call = (get_channel_tab, channel_id, str(page_number), sort, tab, int(view))
continuation = True
polymer_json = gevent.spawn(*page_call)
polymer_json.join()
if polymer_json.exception:
raise polymer_json.exception
polymer_json = polymer_json.value
else:
# Cache miss — restart from page 1 (better than an error)
browse_call = (get_channel_tab, channel_id, '1', sort, tab, int(view))
continuation = True
# Page 1: generate fresh ctoken
page_call = (get_channel_tab, channel_id, str(page_number), sort, tab, int(view))
continuation = True
polymer_json = gevent.spawn(*page_call)
polymer_json.join()
if polymer_json.exception:
raise polymer_json.exception
polymer_json = polymer_json.value
else:
browse_call = (get_channel_tab, channel_id, '1', sort, tab, int(view))
# videos tab - original logic
page_call = (get_channel_tab, channel_id, str(page_number), sort,
tab, int(view))
continuation = True
# Single browse call; number_of_videos is computed from items actually
# fetched so we don't mislead the user with a total that includes
# shorts (which this branch is explicitly excluding for sort=4).
task = gevent.spawn(*browse_call)
task.join()
util.check_gevent_exceptions(task)
polymer_json = task.value
if tab == 'videos':
# Only need video count for the videos tab
if channel_id:
num_videos_call = (get_number_of_videos_channel, channel_id)
else:
num_videos_call = (get_number_of_videos_general, base_url)
tasks = (
gevent.spawn(*num_videos_call),
gevent.spawn(*page_call),
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
number_of_videos, polymer_json = tasks[0].value, tasks[1].value
# For shorts/streams, polymer_json is already set above, nothing to do here
elif tab == 'about':
# polymer_json = util.fetch_url(base_url + '/about?pbj=1', headers_desktop, debug_name='gen_channel_about')
@@ -567,23 +569,23 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
elif tab == 'search' and channel_id:
polymer_json = get_channel_search_json(channel_id, query, page_number)
elif tab == 'search':
url = f'{base_url}/search?pbj=1&query={urllib.parse.quote(query, safe="")}'
url = base_url + '/search?pbj=1&query=' + urllib.parse.quote(query, safe='')
polymer_json = util.fetch_url(url, headers_desktop, debug_name='gen_channel_search')
elif tab != 'videos':
flask.abort(404, f'Unknown channel tab: {tab}')
elif tab == 'videos':
pass
else:
flask.abort(404, 'Unknown channel tab: ' + tab)
if polymer_json is not None and info is None:
if polymer_json is not None:
info = yt_data_extract.extract_channel_info(
json.loads(polymer_json), tab, continuation=continuation
)
if info is None:
return flask.render_template('error.html', error_message='Could not retrieve channel data')
if info['error'] is not None:
return flask.render_template('error.html', error_message=info['error'])
if channel_id:
info['channel_url'] = f'https://www.youtube.com/channel/{channel_id}'
info['channel_url'] = 'https://www.youtube.com/channel/' + channel_id
info['channel_id'] = channel_id
else:
channel_id = info['channel_id']
@@ -608,40 +610,16 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
item.update(additional_info)
if tab in ('videos', 'shorts', 'streams'):
# For any tab using the channel browse API (sort=4, shorts, streams),
# pagination is driven by the ctoken YouTube returns in the response.
# Cache it so the next page request can use it.
if info.get('ctoken'):
cache_key = (channel_id, tab, sort, page_number)
continuation_token_cache[cache_key] = info['ctoken']
# Determine is_last_page and final number_of_pages.
# For channel-API-driven tabs (sort=4, shorts, streams, UU fallback),
# YouTube doesn't give us a reliable total filtered count. So instead
# of displaying a misleading number (the total-including-shorts from
# get_number_of_videos_channel), we count only what we've actually
# paged through, and use the ctoken to know whether to show "next".
if used_channel_api:
if tab in ('shorts', 'streams'):
# For shorts/streams, use ctoken to determine pagination
info['is_last_page'] = (info.get('ctoken') is None)
items_on_page = len(info.get('items', []))
items_seen_so_far = (page_number - 1) * page_size + items_on_page
# Use accumulated count as the displayed total so "N videos" shown
# to the user always matches what they could actually reach.
number_of_videos = items_seen_so_far
# If there's more content, bump by 1 so the Next-page button exists
number_of_videos = len(info.get('items', []))
# Cache the ctoken for next page
if info.get('ctoken'):
number_of_videos = max(number_of_videos,
page_number * page_size + 1)
# For sort=3 via UU playlist (used_channel_api=False), number_of_videos
# was already set from playlist metadata above.
cache_key = (channel_id, tab, sort, page_number)
continuation_token_cache[cache_key] = info['ctoken']
info['number_of_videos'] = number_of_videos
info['number_of_pages'] = math.ceil(number_of_videos / page_size) if number_of_videos else 1
# Never show fewer pages than the page the user is actually on
if info['number_of_pages'] < page_number:
info['number_of_pages'] = page_number
info['number_of_pages'] = math.ceil(number_of_videos/page_size) if number_of_videos else 1
info['header_playlist_names'] = local_playlist.get_playlist_names()
if tab in ('videos', 'shorts', 'streams', 'playlists'):
info['current_sort'] = sort
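The pagination bookkeeping in this hunk has two parts: the ctoken returned with page N is cached so that page N+1 can be requested with it, and the page count is never allowed to drop below the page being viewed. The sketch below uses a plain dict as a stand-in for `continuation_token_cache`, whose real type is not visible in this diff; the three function names are hypothetical.

```python
import math

# Stand-in for the module-level cache; the real object may be a bounded cache.
continuation_token_cache = {}


def remember_ctoken(channel_id, tab, sort, page_number, ctoken):
    """Store the token returned with page_number, if YouTube sent one."""
    if ctoken:
        continuation_token_cache[(channel_id, tab, sort, page_number)] = ctoken


def ctoken_for_page(channel_id, tab, sort, page_number):
    """Token that fetches page_number: the one returned with the previous page."""
    return continuation_token_cache.get(
        (channel_id, tab, sort, page_number - 1))


def number_of_pages(number_of_videos, page_size, page_number):
    """Page count from a possibly unreliable total, clamped to the current page."""
    pages = math.ceil(number_of_videos / page_size) if number_of_videos else 1
    return max(pages, page_number)
```

A cache miss returns `None`, which corresponds to the branches in the diff that fall back to generating a fresh token.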
@@ -663,22 +641,22 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
@yt_app.route('/channel/<channel_id>/')
@yt_app.route('/channel/<channel_id>/<tab>')
def get_channel_page(channel_id, tab='videos'):
return get_channel_page_general_url(f'https://www.youtube.com/channel/{channel_id}', tab, request, channel_id)
return get_channel_page_general_url('https://www.youtube.com/channel/' + channel_id, tab, request, channel_id)
@yt_app.route('/user/<username>/')
@yt_app.route('/user/<username>/<tab>')
def get_user_page(username, tab='videos'):
return get_channel_page_general_url(f'https://www.youtube.com/user/{username}', tab, request)
return get_channel_page_general_url('https://www.youtube.com/user/' + username, tab, request)
@yt_app.route('/c/<custom>/')
@yt_app.route('/c/<custom>/<tab>')
def get_custom_c_page(custom, tab='videos'):
return get_channel_page_general_url(f'https://www.youtube.com/c/{custom}', tab, request)
return get_channel_page_general_url('https://www.youtube.com/c/' + custom, tab, request)
@yt_app.route('/<custom>')
@yt_app.route('/<custom>/<tab>')
def get_toplevel_custom_page(custom, tab='videos'):
return get_channel_page_general_url(f'https://www.youtube.com/{custom}', tab, request)
return get_channel_page_general_url('https://www.youtube.com/' + custom, tab, request)

@@ -104,19 +104,20 @@ def post_process_comments_info(comments_info):
comment['replies_url'] = None
comment['replies_url'] = concat_or_none(
util.URL_ORIGIN,
f'/comments?replies=1&ctoken={ctoken}')
'/comments?replies=1&ctoken=' + ctoken)
if reply_count == 0:
comment['view_replies_text'] = 'Reply'
elif reply_count == 1:
comment['view_replies_text'] = '1 reply'
else:
comment['view_replies_text'] = f'{reply_count} replies'
comment['view_replies_text'] = str(reply_count) + ' replies'
if comment['approx_like_count'] == '1':
comment['likes_text'] = '1 like'
else:
comment['likes_text'] = f"{comment['approx_like_count']} likes"
comment['likes_text'] = (str(comment['approx_like_count'])
+ ' likes')
comments_info['include_avatars'] = settings.enable_comment_avatars
if comments_info['ctoken']:
@@ -154,48 +155,48 @@ def post_process_comments_info(comments_info):
def video_comments(video_id, sort=0, offset=0, lc='', secret_key=''):
if not settings.comments_mode:
return {}
# Initialize the result dict up-front so that any exception path below
# can safely attach an 'error' field without risking UnboundLocalError.
comments_info = {'error': None}
try:
other_sort_url = (
f"{util.URL_ORIGIN}/comments?ctoken="
f"{make_comment_ctoken(video_id, sort=1 - sort, lc=lc)}"
)
other_sort_text = f'Sort by {"newest" if sort == 0 else "top"}'
if settings.comments_mode:
comments_info = {'error': None}
other_sort_url = (
util.URL_ORIGIN + '/comments?ctoken='
+ make_comment_ctoken(video_id, sort=1 - sort, lc=lc)
)
other_sort_text = 'Sort by ' + ('newest' if sort == 0 else 'top')
this_sort_url = (f"{util.URL_ORIGIN}/comments?ctoken="
f"{make_comment_ctoken(video_id, sort=sort, lc=lc)}")
this_sort_url = (util.URL_ORIGIN
+ '/comments?ctoken='
+ make_comment_ctoken(video_id, sort=sort, lc=lc))
comments_info['comment_links'] = [
(other_sort_text, other_sort_url),
('Direct link', this_sort_url)
]
comments_info['comment_links'] = [
(other_sort_text, other_sort_url),
('Direct link', this_sort_url)
]
ctoken = make_comment_ctoken(video_id, sort, offset, lc)
comments_info.update(yt_data_extract.extract_comments_info(
request_comments(ctoken), ctoken=ctoken
))
post_process_comments_info(comments_info)
ctoken = make_comment_ctoken(video_id, sort, offset, lc)
comments_info.update(yt_data_extract.extract_comments_info(
request_comments(ctoken), ctoken=ctoken
))
post_process_comments_info(comments_info)
return comments_info
return comments_info
else:
return {}
except util.FetchError as e:
if e.code == '429' and settings.route_tor:
comments_info['error'] = 'Error: YouTube blocked the request because the Tor exit node is overutilized.'
if e.error_message:
comments_info['error'] += f'\n\n{e.error_message}'
comments_info['error'] += f'\n\nExit node IP address: {e.ip}'
comments_info['error'] += '\n\n' + e.error_message
comments_info['error'] += '\n\nExit node IP address: %s' % e.ip
else:
comments_info['error'] = f'YouTube blocked the request. Error: {e}'
comments_info['error'] = 'YouTube blocked the request. Error: %s' % str(e)
except Exception as e:
comments_info['error'] = f'YouTube blocked the request. Error: {e}'
comments_info['error'] = 'YouTube blocked the request. Error: %s' % str(e)
if comments_info.get('error'):
print(f'Error retrieving comments for {video_id}:\n{comments_info["error"]}')
print('Error retrieving comments for ' + str(video_id) + ':\n' +
comments_info['error'])
return comments_info
@@ -215,10 +216,12 @@ def get_comments_page():
other_sort_url = None
else:
other_sort_url = (
f'{util.URL_ORIGIN}/comments?ctoken='
f'{make_comment_ctoken(comments_info["video_id"], sort=1-comments_info["sort"])}'
util.URL_ORIGIN
+ '/comments?ctoken='
+ make_comment_ctoken(comments_info['video_id'],
sort=1-comments_info['sort'])
)
other_sort_text = f'Sort by {"newest" if comments_info["sort"] == 0 else "top"}'
other_sort_text = 'Sort by ' + ('newest' if comments_info['sort'] == 0 else 'top')
comments_info['comment_links'] = [(other_sort_text, other_sort_url)]
return flask.render_template(

@@ -1,3 +1 @@
from .get_app_version import app_version
__all__ = ['app_version']
from .get_app_version import *

@@ -1,9 +1,11 @@
from __future__ import unicode_literals
import os
import shutil
import subprocess
from subprocess import (
call,
STDOUT
)
from ..version import __version__
import os
import subprocess
def app_version():
@@ -11,46 +13,35 @@ def app_version():
# make minimal environment
env = {k: os.environ[k] for k in ['SYSTEMROOT', 'PATH'] if k in os.environ}
env.update({'LANGUAGE': 'C', 'LANG': 'C', 'LC_ALL': 'C'})
out = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env).communicate()[0]
return out
subst_list = {
'version': __version__,
'branch': None,
'commit': None,
"version": __version__,
"branch": None,
"commit": None
}
# Use shutil.which instead of `command -v`/os.system so we don't spawn a
# shell (CWE-78 hardening) and so it works cross-platform.
if shutil.which('git') is None:
if os.system("command -v git > /dev/null 2>&1") != 0:
return subst_list
try:
# Check we are inside a git work tree. Using DEVNULL avoids the
# file-handle leak from `open(os.devnull, 'w')`.
rc = subprocess.call(
['git', 'branch'],
stderr=subprocess.DEVNULL,
stdout=subprocess.DEVNULL,
)
except OSError:
return subst_list
if rc != 0:
if call(["git", "branch"], stderr=STDOUT, stdout=open(os.devnull, 'w')) != 0:
return subst_list
describe = minimal_env_cmd(['git', 'describe', '--tags', '--always'])
describe = minimal_env_cmd(["git", "describe", "--tags", "--always"])
git_revision = describe.strip().decode('ascii')
branch = minimal_env_cmd(['git', 'branch'])
branch = minimal_env_cmd(["git", "branch"])
git_branch = branch.strip().decode('ascii').replace('* ', '')
subst_list.update({
'branch': git_branch,
'commit': git_revision,
"branch": git_branch,
"commit": git_revision
})
return subst_list
if __name__ == '__main__':
if __name__ == "__main__":
app_version()
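The shell-free variant of the git check shown above (the `shutil.which` and `subprocess.DEVNULL` side of the diff) can be sketched as a small reusable helper. The name `run_quietly` and its return convention are hypothetical, chosen only for this illustration.

```python
import shutil
import subprocess


def run_quietly(cmd):
    """Hypothetical helper: run cmd (a list) without a shell and discard its output.

    Returns the exit code, or None if the executable cannot be found or started.
    """
    # shutil.which looks the program up on PATH without spawning a shell.
    if shutil.which(cmd[0]) is None:
        return None
    try:
        # DEVNULL avoids opening (and leaking) a handle to os.devnull by hand.
        return subprocess.call(cmd,
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
    except OSError:
        return None


def inside_git_work_tree():
    # Mirrors the diff's check: `git branch` exits non-zero outside a repository.
    return run_quietly(['git', 'branch']) == 0
```

Passing the command as a list keeps arguments from being interpreted by a shell, which is the point of the hardening comment in the diff.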

@@ -1,42 +1,28 @@
from youtube import util
from youtube import util, yt_data_extract
from youtube import yt_app
import settings
import os
import json
import html
import gevent
import urllib
import math
import glob
import re
import flask
from flask import request
playlists_directory = os.path.join(settings.data_dir, 'playlists')
thumbnails_directory = os.path.join(settings.data_dir, 'playlist_thumbnails')
# Whitelist accepted playlist names so user input cannot escape
# `playlists_directory` / `thumbnails_directory` (CWE-22, OWASP A01:2021).
# Allow letters, digits, spaces, dot, dash and underscore.
_PLAYLIST_NAME_RE = re.compile(r'^[\w .\-]{1,128}$')
def _validate_playlist_name(name):
'''Return the stripped name if safe, otherwise abort with 400.'''
if name is None:
flask.abort(400)
name = name.strip()
if not _PLAYLIST_NAME_RE.match(name):
flask.abort(400)
return name
playlists_directory = os.path.join(settings.data_dir, "playlists")
thumbnails_directory = os.path.join(settings.data_dir, "playlist_thumbnails")
def _find_playlist_path(name):
'''Find playlist file robustly, handling trailing spaces in filenames'''
name = _validate_playlist_name(name)
pattern = os.path.join(playlists_directory, name + '*.txt')
"""Find playlist file robustly, handling trailing spaces in filenames"""
name = name.strip()
pattern = os.path.join(playlists_directory, name + "*.txt")
files = glob.glob(pattern)
return files[0] if files else os.path.join(playlists_directory, name + '.txt')
return files[0] if files else os.path.join(playlists_directory, name + ".txt")
def _parse_playlist_lines(data):
@@ -92,7 +78,9 @@ def add_extra_info_to_videos(videos, playlist_name):
util.add_extra_html_info(video)
if video['id'] + '.jpg' in thumbnails:
video['thumbnail'] = (
f'/https://youtube.com/data/playlist_thumbnails/{playlist_name}/{video["id"]}.jpg')
'/https://youtube.com/data/playlist_thumbnails/'
+ playlist_name
+ '/' + video['id'] + '.jpg')
else:
video['thumbnail'] = util.get_thumbnail_url(video['id'])
missing_thumbnails.append(video['id'])
@@ -191,9 +179,8 @@ def path_edit_playlist(playlist_name):
redirect_page_number = min(int(request.values.get('page', 1)), math.ceil(number_of_videos_remaining/50))
return flask.redirect(util.URL_ORIGIN + request.path + '?page=' + str(redirect_page_number))
elif request.values['action'] == 'remove_playlist':
safe_name = _validate_playlist_name(playlist_name)
try:
os.remove(os.path.join(playlists_directory, safe_name + '.txt'))
os.remove(os.path.join(playlists_directory, playlist_name + ".txt"))
except OSError:
pass
return flask.redirect(util.URL_ORIGIN + '/playlists')
@@ -233,17 +220,8 @@ def edit_playlist():
flask.abort(400)
_THUMBNAIL_RE = re.compile(r'^[A-Za-z0-9_-]{11}\.jpg$')
@yt_app.route('/data/playlist_thumbnails/<playlist_name>/<thumbnail>')
def serve_thumbnail(playlist_name, thumbnail):
# Validate both path components so a crafted URL cannot escape
# `thumbnails_directory` via `..` or NUL tricks (CWE-22).
safe_name = _validate_playlist_name(playlist_name)
if not _THUMBNAIL_RE.match(thumbnail):
flask.abort(400)
# .. is necessary because flask always uses the application directory at
# ./youtube, not the working directory.
# .. is necessary because flask always uses the application directory at ./youtube, not the working directory
return flask.send_from_directory(
os.path.join('..', thumbnails_directory, safe_name), thumbnail)
os.path.join('..', thumbnails_directory, playlist_name), thumbnail)
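The whitelist approach used by `_validate_playlist_name` in this file's diff can be tried out without Flask. In the sketch below, `safe_playlist_path` is a hypothetical name, and `ValueError` stands in for `flask.abort(400)` so the example is self-contained. The extra rejection of dot-only names is an addition for illustration and is not part of the diff: the pattern as written accepts `..`, which matters wherever the name is used as a directory component.

```python
import os
import re

# Same pattern as in the diff: letters, digits, underscore, space, dot, dash.
_PLAYLIST_NAME_RE = re.compile(r'^[\w .\-]{1,128}$')


def safe_playlist_path(base_dir, name):
    """Hypothetical helper: return the playlist file path or raise ValueError."""
    if name is None:
        raise ValueError('missing playlist name')
    name = name.strip()
    if not _PLAYLIST_NAME_RE.match(name):
        raise ValueError('invalid playlist name')
    # Addition for this sketch (not in the diff): '.' and '..' pass the
    # pattern above, so reject names made up only of dots.
    if name.strip('.') == '':
        raise ValueError('invalid playlist name')
    return os.path.join(base_dir, name + '.txt')
```

Because path separators are not in the allowed character set, a value such as `../etc/passwd` is rejected before it ever reaches `os.path.join`.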

@@ -3,7 +3,9 @@ from youtube import yt_app
import settings
import base64
import urllib
import json
import string
import gevent
import math
from flask import request, abort
@@ -20,7 +22,7 @@ def playlist_ctoken(playlist_id, offset, include_shorts=True):
continuation_info = proto.string(3, proto.percent_b64encode(offset))
playlist_id = proto.string(2, f'VL{playlist_id}')
playlist_id = proto.string(2, 'VL' + playlist_id)
pointless_nest = proto.string(80226972, playlist_id + continuation_info)
return base64.urlsafe_b64encode(pointless_nest).decode('ascii')
@@ -30,7 +32,7 @@ def playlist_first_page(playlist_id, report_text="Retrieved playlist",
use_mobile=False):
# Use innertube API (pbj=1 no longer works for many playlists)
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = f'https://www.youtube.com/youtubei/v1/browse?key={key}'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
data = {
'context': {
@@ -41,7 +43,7 @@ def playlist_first_page(playlist_id, report_text="Retrieved playlist",
'clientVersion': '2.20240327.00.00',
},
},
'browseId': f'VL{playlist_id}',
'browseId': 'VL' + playlist_id,
}
content_type_header = (('Content-Type', 'application/json'),)
@@ -58,7 +60,7 @@ def get_videos(playlist_id, page, include_shorts=True, use_mobile=False,
page_size = 100
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = f'https://www.youtube.com/youtubei/v1/browse?key={key}'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
ctoken = playlist_ctoken(playlist_id, (int(page)-1)*page_size,
include_shorts=include_shorts)
@@ -97,7 +99,7 @@ def get_playlist_page():
if playlist_id.startswith('RD'):
first_video_id = playlist_id[2:] # video ID after 'RD' prefix
return flask.redirect(
f'{util.URL_ORIGIN}/watch?v={first_video_id}&list={playlist_id}',
util.URL_ORIGIN + '/watch?v=' + first_video_id + '&list=' + playlist_id,
302
)
@@ -132,9 +134,9 @@ def get_playlist_page():
if 'id' in item and not item.get('thumbnail'):
item['thumbnail'] = f"{settings.img_prefix}https://i.ytimg.com/vi/{item['id']}/hqdefault.jpg"
item['url'] += f'&list={playlist_id}'
item['url'] += '&list=' + playlist_id
if item['index']:
item['url'] += f'&index={item["index"]}'
item['url'] += '&index=' + str(item['index'])
video_count = yt_data_extract.deep_get(info, 'metadata', 'video_count')
if video_count is None:

@@ -76,7 +76,7 @@ def read_varint(data):
except IndexError:
if i == 0:
raise EOFError()
raise Exception(f'Unterminated varint starting at {data.tell() - i}')
raise Exception('Unterminated varint starting at ' + str(data.tell() - i))
result |= (byte & 127) << 7*i
if not byte & 128:
break
@@ -118,7 +118,7 @@ def read_protobuf(data):
elif wire_type == 5:
value = data.read(4)
else:
raise Exception(f"Unknown wire type: {wire_type} at position {data.tell()}")
raise Exception("Unknown wire type: " + str(wire_type) + " at position " + str(data.tell()))
yield (wire_type, field_number, value)
@@ -170,7 +170,8 @@ def _make_protobuf(data):
elif field[0] == 2:
result += string(field[1], _make_protobuf(field[2]))
else:
raise NotImplementedError(f'Wire type {field[0]} not implemented')
raise NotImplementedError('Wire type ' + str(field[0])
+ ' not implemented')
return result
return data
@@ -217,4 +218,4 @@ def b64_to_bytes(data):
if isinstance(data, bytes):
data = data.decode('ascii')
data = data.replace("%3D", "=")
return base64.urlsafe_b64decode(f'{data}={"=" * ((4 - len(data) % 4) % 4)}')
return base64.urlsafe_b64decode(data + "="*((4 - len(data) % 4) % 4))
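The padding arithmetic in `b64_to_bytes` is easy to check on its own. The sketch below reproduces the concatenation form of the function from the hunk above; the sample inputs in the usage note are illustrative values, not taken from the project.

```python
import base64


def b64_to_bytes(data):
    # Accept bytes or str, as in the diff.
    if isinstance(data, bytes):
        data = data.decode('ascii')
    # Undo percent-encoding of the padding character.
    data = data.replace('%3D', '=')
    # Base64 works in groups of 4 characters; add 0 to 3 '=' to reach a multiple of 4.
    return base64.urlsafe_b64decode(data + '=' * ((4 - len(data) % 4) % 4))
```

For example, `b64_to_bytes('aGk')` pads the input to `aGk=` before decoding, while an input whose length is already a multiple of 4 gets no extra padding because of the outer `% 4`.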

@@ -179,7 +179,7 @@ def read_varint(data):
except IndexError:
if i == 0:
raise EOFError()
raise Exception(f'Unterminated varint starting at {data.tell() - i}')
raise Exception('Unterminated varint starting at ' + str(data.tell() - i))
result |= (byte & 127) << 7*i
if not byte & 128:
break
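The varint loop touched by this hunk can be sketched as a complete function. Only the bit manipulation (`byte & 127`, shift by `7*i`, stop when the high bit is clear) comes from the diff; the surrounding loop, the use of an empty read to detect end of input, and the `ValueError` type are assumptions made to keep the example self-contained.

```python
import io


def read_varint(data):
    """Decode one protobuf-style varint from a binary file-like object."""
    result = 0
    i = 0
    while True:
        chunk = data.read(1)
        if not chunk:
            # Nothing read at all: clean end of input.
            if i == 0:
                raise EOFError()
            # Input ended while the continuation bit was still set.
            raise ValueError('Unterminated varint starting at '
                             + str(data.tell() - i))
        byte = chunk[0]
        # The low 7 bits carry the payload, least significant group first.
        result |= (byte & 127) << 7 * i
        # A clear high bit marks the final byte.
        if not byte & 128:
            break
        i += 1
    return result
```

As a worked case, the bytes `0xAC 0x02` decode as `(0xAC & 127) + (0x02 << 7) = 44 + 256 = 300`.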
@@ -235,7 +235,8 @@ def _make_protobuf(data):
elif field[0] == 2:
result += string(field[1], _make_protobuf(field[2]))
else:
raise NotImplementedError(f'Wire type {field[0]} not implemented')
raise NotImplementedError('Wire type ' + str(field[0])
+ ' not implemented')
return result
return data
@@ -285,7 +286,7 @@ def b64_to_bytes(data):
if isinstance(data, bytes):
data = data.decode('ascii')
data = data.replace("%3D", "=")
return base64.urlsafe_b64decode(f'{data}={"=" * ((4 - len(data) % 4) % 4)}')
return base64.urlsafe_b64decode(data + "="*((4 - len(data) % 4) % 4))
# --------------------------------------------------------------------
@@ -343,7 +344,7 @@ fromhex = bytes.fromhex
def aligned_ascii(data):
return ' '.join(f' {chr(n)}' if n in range(32, 128) else ' _' for n in data)
return ' '.join(' ' + chr(n) if n in range(32, 128) else ' _' for n in data)
def parse_protobuf(data, mutable=False, spec=()):
@@ -371,7 +372,7 @@ def parse_protobuf(data, mutable=False, spec=()):
elif wire_type == 5:
value = data.read(4)
else:
raise Exception(f"Unknown wire type: {wire_type}, Tag: {bytes_to_hex(varint_encode(tag))}, at position {data.tell()}")
raise Exception("Unknown wire type: " + str(wire_type) + ", Tag: " + bytes_to_hex(varint_encode(tag)) + ", at position " + str(data.tell()))
if mutable:
yield [wire_type, field_number, value]
else:
@@ -452,7 +453,7 @@ def b32decode(s, casefold=False, map01=None):
if map01 is not None:
map01 = _bytes_from_decode_data(map01)
assert len(map01) == 1, repr(map01)
s = s.translate(bytes.maketrans(b'01', f'O{map01.decode("ascii")}'))
s = s.translate(bytes.maketrans(b'01', b'O' + map01))
if casefold:
s = s.upper()
# Strip off pad characters from the right. We need to count the pad
@@ -493,7 +494,7 @@ def b32decode(s, casefold=False, map01=None):
def dec32(data):
if isinstance(data, bytes):
data = data.decode('ascii')
return b32decode(f'{data}={"=" * ((8 - len(data)%8)%8)}')
return b32decode(data + "="*((8 - len(data)%8)%8))
_patterns = [
@@ -562,7 +563,9 @@ def _pp(obj, indent): # not my best work
if len(obj) == 3: # (wire_type, field_number, data)
return obj.__repr__()
else: # (base64, [...])
return f"({obj[0].__repr__()},\n{indent_lines(_pp(obj[1], indent), indent)}\n)"
return ('(' + obj[0].__repr__() + ',\n'
+ indent_lines(_pp(obj[1], indent), indent) + '\n'
+ ')')
elif isinstance(obj, list):
# [wire_type, field_number, data]
if (len(obj) == 3
@@ -574,11 +577,13 @@ def _pp(obj, indent): # not my best work
elif (len(obj) == 3
and not any(isinstance(x, (list, tuple)) for x in obj[0:2])
):
return f"[{obj[0].__repr__()}, {obj[1].__repr__()},\n{indent_lines(_pp(obj[2], indent), indent)}\n]"
return ('[' + obj[0].__repr__() + ', ' + obj[1].__repr__() + ',\n'
+ indent_lines(_pp(obj[2], indent), indent) + '\n'
+ ']')
else:
s = '[\n'
for x in obj:
s += f"{indent_lines(_pp(x, indent), indent)},\n"
s += indent_lines(_pp(x, indent), indent) + ',\n'
s += ']'
return s
else:

@@ -5,6 +5,7 @@ import settings
import json
import urllib
import base64
import mimetypes
from flask import request
import flask
import os
@@ -51,7 +52,7 @@ def get_search_json(query, page, autocorrect, sort, filters):
'X-YouTube-Client-Name': '1',
'X-YouTube-Client-Version': '2.20180418',
}
url += f"&pbj=1&sp={page_number_to_sp_parameter(page, autocorrect, sort, filters).replace('=', '%3D')}"
url += "&pbj=1&sp=" + page_number_to_sp_parameter(page, autocorrect, sort, filters).replace("=", "%3D")
content = util.fetch_url(url, headers=headers, report_text="Got search results", debug_name='search_results')
info = json.loads(content)
return info

@@ -9,8 +9,6 @@
--thumb-background: #222222;
--link: #00B0FF;
--link-visited: #40C4FF;
--border-color: #333333;
--thead-background: #0a0a0b;
--border-bg: #222222;
--border-bg-settings: #000000;
--border-bg-license: #000000;

@@ -9,8 +9,6 @@
--thumb-background: #35404D;
--link: #22AAFF;
--link-visited: #7755FF;
--border-color: #4A5568;
--thead-background: #1a2530;
--border-bg: #FFFFFF;
--border-bg-settings: #FFFFFF;
--border-bg-license: #FFFFFF;

@@ -9,8 +9,6 @@
--thumb-background: #F5F5F5;
--link: #212121;
--link-visited: #808080;
--border-color: #CCCCCC;
--thead-background: #d0d0d0;
--border-bg: #212121;
--border-bg-settings: #91918C;
--border-bg-license: #91918C;

@@ -307,122 +307,18 @@ figure.sc-video {
padding-top: 0.5rem;
padding-bottom: 0.5rem;
}
.v-download {
grid-area: v-download;
margin-bottom: 0.5rem;
.v-download { grid-area: v-download; }
.v-download > ul.download-dropdown-content {
background: var(--secondary-background);
padding-left: 0px;
}
.v-download details {
display: block;
width: 100%;
}
.v-download > summary {
cursor: pointer;
.v-download > ul.download-dropdown-content > li.download-format {
list-style: none;
padding: 0.4rem 0;
padding-left: 1rem;
}
.v-download > summary.download-dropdown-label {
cursor: pointer;
-webkit-touch-callout: none;
-webkit-user-select: none;
-khtml-user-select: none;
-moz-user-select: none;
-ms-user-select: none;
user-select: none;
padding-bottom: 6px;
padding-left: .75em;
padding-right: .75em;
padding-top: 6px;
text-align: center;
white-space: nowrap;
background-color: var(--buttom);
border: 1px solid var(--button-border);
color: var(--buttom-text);
border-radius: 5px;
margin-bottom: 0.5rem;
}
.v-download > summary.download-dropdown-label:hover {
background-color: var(--buttom-hover);
}
.v-download > .download-table-container {
background: var(--secondary-background);
max-height: 65vh;
overflow-y: auto;
border: 1px solid var(--button-border);
border-radius: 8px;
box-shadow: 0 4px 12px rgba(0,0,0,0.15);
}
.download-table {
width: 100%;
border-collapse: separate;
border-spacing: 0;
font-size: 0.875rem;
}
.download-table thead {
background: var(--thead-background);
position: sticky;
top: 0;
z-index: 1;
}
.download-table th,
.download-table td {
padding: 0.7rem 0.9rem;
text-align: left;
border-bottom: 1px solid var(--button-border);
}
.download-table th {
font-weight: 600;
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.8px;
}
.download-table tbody tr {
transition: all 0.2s ease;
}
.download-table tbody tr:hover {
background: var(--primary-background);
}
.download-table a.download-link {
display: inline-block;
padding: 0.4rem 0.85rem;
background: rgba(0,0,0,0.12);
color: var(--buttom-text);
.v-download > ul.download-dropdown-content > li.download-format a.download-link {
text-decoration: none;
border-radius: 5px;
font-weight: 500;
font-size: 0.85rem;
transition: background 0.2s ease;
white-space: nowrap;
}
.download-table a.download-link:hover {
background: rgba(0,0,0,0.28);
color: var(--buttom-text);
}
.download-table tbody tr:last-child td {
border-bottom: none;
}
.download-table td[data-label="Ext"] {
font-family: monospace;
font-size: 0.8rem;
font-weight: 600;
}
.download-table td[data-label="Link"] {
white-space: nowrap;
vertical-align: middle;
}
.download-table td[data-label="Codecs"] {
max-width: 180px;
text-overflow: ellipsis;
overflow: hidden;
font-family: monospace;
font-size: 0.75rem;
}
.download-table td[data-label="Size"] {
font-family: monospace;
font-size: 0.85rem;
}
.download-table td[colspan="3"] {
font-style: italic;
opacity: 0.7;
}
.v-description {

@@ -126,7 +126,7 @@ def delete_thumbnails(to_delete):
os.remove(os.path.join(thumbnails_directory, thumbnail))
existing_thumbnails.remove(video_id)
except Exception:
print(f'Failed to delete thumbnail: {thumbnail}')
print('Failed to delete thumbnail: ' + thumbnail)
traceback.print_exc()
@@ -184,7 +184,7 @@ def _get_videos(cursor, number_per_page, offset, tag=None):
'time_published': exact_timestamp(db_video[3]) if db_video[4] else posix_to_dumbed_down(db_video[3]),
'author': db_video[5],
'author_id': db_video[6],
'author_url': f'/https://www.youtube.com/channel/{db_video[6]}',
'author_url': '/https://www.youtube.com/channel/' + db_video[6],
})
return videos, pseudo_number_of_videos
@@ -292,10 +292,7 @@ def youtube_timestamp_to_posix(dumb_timestamp):
def posix_to_dumbed_down(posix_time):
'''Inverse of youtube_timestamp_to_posix.'''
delta = int(time.time() - posix_time)
# Guard against future timestamps (clock drift) without relying on
# `assert` (which is stripped under `python -O`).
if delta < 0:
delta = 0
assert delta >= 0
if delta == 0:
return '0 seconds ago'
@@ -304,9 +301,9 @@ def posix_to_dumbed_down(posix_time):
if delta >= unit_time:
quantifier = round(delta/unit_time)
if quantifier == 1:
return f'1 {unit_name} ago'
return '1 ' + unit_name + ' ago'
else:
return f'{quantifier} {unit_name}s ago'
return str(quantifier) + ' ' + unit_name + 's ago'
else:
raise Exception()
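The difference between the two variants of `posix_to_dumbed_down` above (clamping a negative delta versus `assert delta >= 0`) shows up when a timestamp lies slightly in the future. The sketch below follows the clamping variant. The `_UNITS` table and the `now` parameter are assumptions for this example: the real unit table is outside the hunk, and `now` exists only to make the function deterministic.

```python
import time

# Assumed unit table, largest unit first; the project's actual table is not shown.
_UNITS = (
    ('year', 31536000),
    ('month', 2592000),
    ('week', 604800),
    ('day', 86400),
    ('hour', 3600),
    ('minute', 60),
    ('second', 1),
)


def posix_to_dumbed_down(posix_time, now=None):
    if now is None:
        now = time.time()
    delta = int(now - posix_time)
    # Clamp instead of assert: asserts vanish under `python -O`, and clock
    # drift can legitimately produce a small negative delta.
    if delta < 0:
        delta = 0
    if delta == 0:
        return '0 seconds ago'
    for unit_name, unit_time in _UNITS:
        if delta >= unit_time:
            quantifier = round(delta / unit_time)
            if quantifier == 1:
                return '1 ' + unit_name + ' ago'
            return str(quantifier) + ' ' + unit_name + 's ago'
    # Unreachable for delta >= 1 because the last unit is one second.
    raise RuntimeError('no matching time unit')
```

With the `assert` variant, a future timestamp raises `AssertionError` in normal runs and is silently unchecked under `-O`; the clamp gives the same result in both modes.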
@@ -363,7 +360,7 @@ def autocheck_dispatcher():
time_until_earliest_job = earliest_job['next_check_time'] - time.time()
if time_until_earliest_job <= -5: # should not happen unless we're running extremely slow
print(f'ERROR: autocheck_dispatcher got job scheduled in the past, skipping and rescheduling: {earliest_job["channel_id"]}, {earliest_job["channel_name"]}, {earliest_job["next_check_time"]}')
print('ERROR: autocheck_dispatcher got job scheduled in the past, skipping and rescheduling: ' + earliest_job['channel_id'] + ', ' + earliest_job['channel_name'] + ', ' + str(earliest_job['next_check_time']))
next_check_time = time.time() + 3600*secrets.randbelow(60)/60
with_open_db(_schedule_checking, earliest_job['channel_id'], next_check_time)
autocheck_jobs[earliest_job_index]['next_check_time'] = next_check_time
@@ -451,7 +448,7 @@ def check_channels_if_necessary(channel_ids):
def _get_atoma_feed(channel_id):
url = f'https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}'
url = 'https://www.youtube.com/feeds/videos.xml?channel_id=' + channel_id
try:
return util.fetch_url(url).decode('utf-8')
except util.FetchError as e:
@@ -485,15 +482,16 @@ def _get_channel_videos_first_page(channel_id, channel_status_name):
return channel_info
except util.FetchError as e:
if e.code == '429' and settings.route_tor:
error_message = (f'Error checking channel {channel_status_name}: '
f'YouTube blocked the request because the Tor exit node is overutilized. '
f'Try getting a new exit node by using the New Identity button in the Tor Browser.')
error_message = ('Error checking channel ' + channel_status_name
+ ': YouTube blocked the request because the'
+ ' Tor exit node is overutilized. Try getting a new exit node'
+ ' by using the New Identity button in the Tor Browser.')
if e.ip:
error_message += f' Exit node IP address: {e.ip}'
error_message += ' Exit node IP address: ' + e.ip
print(error_message)
return None
elif e.code == '502':
print(f'Error checking channel {channel_status_name}: {e}')
print('Error checking channel', channel_status_name + ':', str(e))
return None
raise
@@ -504,7 +502,7 @@ def _get_upstream_videos(channel_id):
except KeyError:
channel_status_name = channel_id
print(f"Checking channel: {channel_status_name}")
print("Checking channel: " + channel_status_name)
tasks = (
# channel page, need for video duration
@@ -533,8 +531,7 @@ def _get_upstream_videos(channel_id):
return None
root = defusedxml.ElementTree.fromstring(feed)
if remove_bullshit(root.tag) != 'feed':
raise ValueError('Root element is not <feed>')
assert remove_bullshit(root.tag) == 'feed'
for entry in root:
if (remove_bullshit(entry.tag) != 'entry'):
continue
@@ -542,22 +539,22 @@ def _get_upstream_videos(channel_id):
# it's yt:videoId in the xml but the yt: is turned into a namespace which is removed by remove_bullshit
video_id_element = find_element(entry, 'videoId')
time_published_element = find_element(entry, 'published')
if video_id_element is None or time_published_element is None:
raise ValueError('Missing videoId or published element')
assert video_id_element is not None
assert time_published_element is not None
time_published = int(calendar.timegm(time.strptime(time_published_element.text, '%Y-%m-%dT%H:%M:%S+00:00')))
times_published[video_id_element.text] = time_published
except ValueError:
print(f'Failed to read atoma feed for {channel_status_name}')
except AssertionError:
print('Failed to read atoma feed for ' + channel_status_name)
traceback.print_exc()
except defusedxml.ElementTree.ParseError:
print(f'Failed to read atoma feed for {channel_status_name}')
print('Failed to read atoma feed for ' + channel_status_name)
if channel_info is None: # there was an error
return
if channel_info['error']:
print(f'Error checking channel {channel_status_name}: {channel_info["error"]}')
print('Error checking channel ' + channel_status_name + ': ' + channel_info['error'])
return
videos = channel_info['items']
@@ -596,10 +593,7 @@ def _get_upstream_videos(channel_id):
# Special case: none of the videos have a time published.
# In this case, make something up
if videos and videos[0]['time_published'] is None:
# Invariant: if the first video has no timestamp, earlier passes
# ensure all of them are unset. Don't rely on `assert`.
if not all(v['time_published'] is None for v in videos):
raise RuntimeError('Inconsistent time_published state')
assert all(v['time_published'] is None for v in videos)
now = time.time()
for i in range(len(videos)):
# 1 month between videos
@@ -814,8 +808,7 @@ def import_subscriptions():
file = file.read().decode('utf-8')
try:
root = defusedxml.ElementTree.fromstring(file)
if root.tag != 'opml':
raise ValueError('Root element is not <opml>')
assert root.tag == 'opml'
channels = []
for outline_element in root[0][0]:
if (outline_element.tag != 'outline') or ('xmlUrl' not in outline_element.attrib):
@@ -826,7 +819,7 @@ def import_subscriptions():
channel_id = channel_rss_url[channel_rss_url.find('channel_id=')+11:].strip()
channels.append((channel_id, channel_name))
except (ValueError, IndexError, defusedxml.ElementTree.ParseError):
except (AssertionError, IndexError, defusedxml.ElementTree.ParseError) as e:
return '400 Bad Request: Unable to read opml xml file, or the file is not the expected format', 400
elif mime_type in ('text/csv', 'application/vnd.ms-excel'):
content = file.read().decode('utf-8')
@@ -1022,7 +1015,7 @@ def get_subscriptions_page():
tag = request.args.get('tag', None)
videos, number_of_videos_in_db = _get_videos(cursor, 60, (page - 1)*60, tag)
for video in videos:
video['thumbnail'] = f'{util.URL_ORIGIN}/data/subscription_thumbnails/{video["id"]}.jpg'
video['thumbnail'] = util.URL_ORIGIN + '/data/subscription_thumbnails/' + video['id'] + '.jpg'
video['type'] = 'video'
video['item_size'] = 'small'
util.add_extra_html_info(video)
@@ -1032,7 +1025,7 @@ def get_subscriptions_page():
subscription_list = []
for channel_name, channel_id, muted in _get_subscribed_channels(cursor):
subscription_list.append({
'channel_url': f'{util.URL_ORIGIN}/channel/{channel_id}',
'channel_url': util.URL_ORIGIN + '/channel/' + channel_id,
'channel_name': channel_name,
'channel_id': channel_id,
'muted': muted,
@@ -1078,20 +1071,11 @@ def post_subscriptions_page():
return '', 204
# YouTube video IDs are exactly 11 chars from [A-Za-z0-9_-]. Enforce this
# before using the value in filesystem paths to prevent path traversal
# (CWE-22, OWASP A01:2021).
_VIDEO_ID_RE = re.compile(r'^[A-Za-z0-9_-]{11}$')
@yt_app.route('/data/subscription_thumbnails/<thumbnail>')
def serve_subscription_thumbnail(thumbnail):
'''Serves thumbnail from disk if it's been saved already. If not, downloads the thumbnail, saves to disk, and serves it.'''
if not thumbnail.endswith('.jpg'):
flask.abort(400)
assert thumbnail[-4:] == '.jpg'
video_id = thumbnail[0:-4]
if not _VIDEO_ID_RE.match(video_id):
flask.abort(400)
thumbnail_path = os.path.join(thumbnails_directory, thumbnail)
if video_id in existing_thumbnails:
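The filename check in `serve_subscription_thumbnail` can be sketched as a pure function. The helper name `video_id_from_thumbnail` and the `None` return value (in place of `flask.abort(400)`) are assumptions for this example. It also uses `fullmatch` rather than `match`, a deliberate variation: with `match`, a trailing `$` still accepts an ID followed by a newline.

```python
import re

# YouTube video IDs: exactly 11 characters from [A-Za-z0-9_-], as in the diff.
_VIDEO_ID_RE = re.compile(r'[A-Za-z0-9_-]{11}')


def video_id_from_thumbnail(thumbnail):
    """Hypothetical helper: return the video ID, or None if the name is not acceptable."""
    if not thumbnail.endswith('.jpg'):
        return None
    video_id = thumbnail[:-4]
    # fullmatch requires the whole string to be a valid ID, so separators,
    # dots and newlines can never reach the filesystem path.
    if not _VIDEO_ID_RE.fullmatch(video_id):
        return None
    return video_id
```

Unlike the `assert thumbnail[-4:] == '.jpg'` variant, this check stays active under `python -O` and rejects bad input with a controlled result instead of an `AssertionError`.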
@@ -1108,17 +1092,17 @@ def serve_subscription_thumbnail(thumbnail):
for quality in ('hq720.jpg', 'sddefault.jpg', 'hqdefault.jpg'):
url = f"https://i.ytimg.com/vi/{video_id}/{quality}"
try:
image = util.fetch_url(url, report_text=f"Saved thumbnail: {video_id}")
image = util.fetch_url(url, report_text="Saved thumbnail: " + video_id)
break
except util.FetchError as e:
if '404' in str(e):
continue
print(f"Failed to download thumbnail for {video_id}: {e}")
print("Failed to download thumbnail for " + video_id + ": " + str(e))
flask.abort(500)
except urllib.error.HTTPError as e:
if e.code == 404:
continue
print(f"Failed to download thumbnail for {video_id}: {e}")
print("Failed to download thumbnail for " + video_id + ": " + str(e))
flask.abort(e.code)
if image is None:

@@ -105,10 +105,5 @@
{% if use_dash %}
<script src="/youtube.com/static/js/av-merge.js"></script>
{% endif %}
<!-- Storyboard Preview Thumbnails (native players only; Plyr handles this internally) -->
{% if settings.use_video_player != 2 and settings.native_player_storyboard %}
<script src="/youtube.com/static/js/storyboard-preview.js"></script>
{% endif %}
</body>
</html>

@@ -102,40 +102,22 @@
{% if settings.use_video_download != 0 %}
<details class="v-download">
<summary class="download-dropdown-label">{{ _('Download') }}</summary>
<div class="download-table-container">
<table class="download-table" aria-label="Download formats">
<thead>
<tr>
<th scope="col">{{ _('Ext') }}</th>
<th scope="col">{{ _('Video') }}</th>
<th scope="col">{{ _('Audio') }}</th>
<th scope="col">{{ _('Size') }}</th>
<th scope="col">{{ _('Codecs') }}</th>
<th scope="col">{{ _('Link') }}</th>
</tr>
</thead>
<tbody>
{% for format in download_formats %}
<tr>
<td data-label="{{ _('Ext') }}">{{ format['ext'] }}</td>
<td data-label="{{ _('Video') }}">{{ format['video_quality'] }}</td>
<td data-label="{{ _('Audio') }}">{{ format['audio_quality'] }}</td>
<td data-label="{{ _('Size') }}">{{ format['file_size'] }}</td>
<td data-label="{{ _('Codecs') }}">{{ format['codecs'] }}</td>
<td data-label="{{ _('Link') }}"><a class="download-link" href="{{ format['url'] }}" download="{{ title }}.{{ format['ext'] }}" aria-label="{{ _('Download') }} {{ format['ext'] }} {{ format['video_quality'] }} {{ format['audio_quality'] }}">{{ _('Download') }}</a></td>
</tr>
{% endfor %}
{% for download in other_downloads %}
<tr>
<td data-label="{{ _('Ext') }}">{{ download['ext'] }}</td>
<td data-label="{{ _('Video') }}" colspan="3">{{ download['label'] }}</td>
<td data-label="{{ _('Codecs') }}">{{ download.get('codecs', 'N/A') }}</td>
<td data-label="{{ _('Link') }}"><a class="download-link" href="{{ download['url'] }}" download aria-label="{{ _('Download') }} {{ download['label'] }}">{{ _('Download') }}</a></td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<ul class="download-dropdown-content">
{% for format in download_formats %}
<li class="download-format">
<a class="download-link" href="{{ format['url'] }}" download="{{ title }}.{{ format['ext'] }}">
{{ format['ext'] }} {{ format['video_quality'] }} {{ format['audio_quality'] }} {{ format['file_size'] }} {{ format['codecs'] }}
</a>
</li>
{% endfor %}
{% for download in other_downloads %}
<li class="download-format">
<a href="{{ download['url'] }}" download>
{{ download['ext'] }} {{ download['label'] }}
</a>
</li>
{% endfor %}
</ul>
</details>
{% else %}
<span class="v-download"></span>
@@ -322,8 +304,8 @@
<!-- /plyr -->
{% endif %}
<!-- Storyboard Preview Thumbnails (native players only; Plyr handles this internally) -->
{% if settings.use_video_player != 2 and settings.native_player_storyboard %}
<!-- Storyboard Preview Thumbnails -->
{% if settings.use_video_player != 2 %}
<script src="/youtube.com/static/js/storyboard-preview.js"></script>
{% endif %}

@@ -1,6 +1,5 @@
from datetime import datetime
import logging
import random
import settings
import socks
import sockshandler
@@ -20,10 +19,10 @@ import gevent.queue
import gevent.lock
import collections
import stem
import stem.control
import traceback
logger = logging.getLogger(__name__)
import stem.control
import traceback
# The trouble with the requests library: It ships its own certificate bundle via certifi
# instead of using the system certificate store, meaning self-signed certificates
@@ -55,8 +54,8 @@ logger = logging.getLogger(__name__)
# https://github.com/kennethreitz/requests/issues/2966
# Until then, I will use a mix of urllib3 and urllib.
import urllib3 # noqa: E402 (imported here intentionally after the long note above)
import urllib3.contrib.socks # noqa: E402
import urllib3
import urllib3.contrib.socks
URL_ORIGIN = "/https://www.youtube.com"
@@ -72,7 +71,7 @@ class TorManager:
def __init__(self):
self.old_tor_connection_pool = None
self.tor_connection_pool = urllib3.contrib.socks.SOCKSProxyManager(
f'socks5h://127.0.0.1:{settings.tor_port}/',
'socks5h://127.0.0.1:' + str(settings.tor_port) + '/',
cert_reqs='CERT_REQUIRED')
self.tor_pool_refresh_time = time.monotonic()
settings.add_setting_changed_hook(
@@ -92,7 +91,7 @@ class TorManager:
self.old_tor_connection_pool = self.tor_connection_pool
self.tor_connection_pool = urllib3.contrib.socks.SOCKSProxyManager(
f'socks5h://127.0.0.1:{settings.tor_port}/',
'socks5h://127.0.0.1:' + str(settings.tor_port) + '/',
cert_reqs='CERT_REQUIRED')
self.tor_pool_refresh_time = time.monotonic()
@@ -178,6 +177,7 @@ def get_pool(use_tor):
class HTTPAsymmetricCookieProcessor(urllib.request.BaseHandler):
'''Separate cookiejars for receiving and sending'''
def __init__(self, cookiejar_send=None, cookiejar_receive=None):
import http.cookiejar
self.cookiejar_send = cookiejar_send
self.cookiejar_receive = cookiejar_receive
@@ -198,9 +198,9 @@ class HTTPAsymmetricCookieProcessor(urllib.request.BaseHandler):
class FetchError(Exception):
def __init__(self, code, reason='', ip=None, error_message=None):
if error_message:
string = f"{code} {reason}: {error_message}"
string = code + ' ' + reason + ': ' + error_message
else:
string = f"HTTP error during request: {code} {reason}"
string = 'HTTP error during request: ' + code + ' ' + reason
Exception.__init__(self, string)
self.code = code
self.reason = reason
@@ -208,16 +208,6 @@ class FetchError(Exception):
self.error_message = error_message
def _noop_cleanup(response):
'''No-op cleanup used when the urllib opener owns the response.'''
return None
def _release_conn_cleanup(response):
'''Release the urllib3 pooled connection back to the pool.'''
response.release_conn()
def decode_content(content, encoding_header):
encodings = encoding_header.replace(' ', '').split(',')
for encoding in reversed(encodings):
@@ -273,7 +263,7 @@ def fetch_url_response(url, headers=(), timeout=15, data=None,
opener = urllib.request.build_opener(cookie_processor)
response = opener.open(req, timeout=timeout)
cleanup_func = _noop_cleanup
cleanup_func = (lambda r: None)
else: # Use a urllib3 pool. Cookies can't be used since urllib3 doesn't have easy support for them.
# default: Retry.DEFAULT = Retry(3)
@@ -294,18 +284,20 @@ def fetch_url_response(url, headers=(), timeout=15, data=None,
exception_cause = e.__context__.__context__
if (isinstance(exception_cause, socks.ProxyConnectionError)
and settings.route_tor):
msg = f'Failed to connect to Tor. Check that Tor is open and that your internet connection is working.\n\n{e}'
msg = ('Failed to connect to Tor. Check that Tor is open and '
'that your internet connection is working.\n\n'
+ str(e))
raise FetchError('502', reason='Bad Gateway',
error_message=msg)
elif isinstance(e.__context__,
urllib3.exceptions.NewConnectionError):
msg = f'Failed to establish a connection.\n\n{e}'
msg = 'Failed to establish a connection.\n\n' + str(e)
raise FetchError(
'502', reason='Bad Gateway',
error_message=msg)
else:
raise
cleanup_func = _release_conn_cleanup
cleanup_func = (lambda r: r.release_conn())
return response, cleanup_func
@@ -323,6 +315,8 @@ def fetch_url(url, headers=(), timeout=15, report_text=None, data=None,
Max retries: 5 attempts with exponential backoff
"""
import random
max_retries = 5
base_delay = 1.0 # Base delay in seconds
@@ -389,7 +383,7 @@ def fetch_url(url, headers=(), timeout=15, report_text=None, data=None,
if error:
raise FetchError(
'429', reason=response.reason, ip=ip,
error_message=f'Automatic circuit change: {error}')
error_message='Automatic circuit change: ' + error)
continue # retry with new identity
# Check for client errors (400, 404) - don't retry these
@@ -407,7 +401,7 @@ def fetch_url(url, headers=(), timeout=15, report_text=None, data=None,
logger.error(f'Server error {response.status} after {max_retries} retries')
raise FetchError(str(response.status), reason=response.reason, ip=None)
# Exponential backoff for server errors. Non-crypto jitter.
# Exponential backoff for server errors
delay = (base_delay * (2 ** attempt)) + random.uniform(0, 1)
logger.warning(f'Server error ({response.status}). Waiting {delay:.1f}s before retry {attempt + 1}/{max_retries}...')
time.sleep(delay)
@@ -438,7 +432,7 @@ def fetch_url(url, headers=(), timeout=15, report_text=None, data=None,
else:
raise
# Wait and retry. Non-crypto jitter.
# Wait and retry
delay = (base_delay * (2 ** attempt)) + random.uniform(0, 1)
logger.warning(f'Connection error. Waiting {delay:.1f}s before retry {attempt + 1}/{max_retries}...')
time.sleep(delay)
@@ -465,7 +459,10 @@ def head(url, use_tor=False, report_text=None, max_redirects=10):
headers = {'User-Agent': 'Python-urllib'}
response = pool.request('HEAD', url, headers=headers, retries=retries)
if report_text:
print(f'{report_text} Latency: {round(time.monotonic() - start_time, 3)}')
print(
report_text,
' Latency:',
round(time.monotonic() - start_time, 3))
return response
mobile_user_agent = 'Mozilla/5.0 (Linux; Android 7.0; Redmi Note 4 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Mobile Safari/537.36'
@@ -535,30 +532,30 @@ class RateLimitedQueue(gevent.queue.Queue):
def download_thumbnail(save_directory, video_id):
save_location = os.path.join(save_directory, video_id + '.jpg')
save_location = os.path.join(save_directory, video_id + ".jpg")
for quality in ('hq720.jpg', 'sddefault.jpg', 'hqdefault.jpg'):
url = f'https://i.ytimg.com/vi/{video_id}/{quality}'
url = f"https://i.ytimg.com/vi/{video_id}/{quality}"
try:
thumbnail = fetch_url(url, report_text=f'Saved thumbnail: {video_id}')
thumbnail = fetch_url(url, report_text="Saved thumbnail: " + video_id)
except FetchError as e:
if '404' in str(e):
continue
print(f'Failed to download thumbnail for {video_id}: {e}')
print("Failed to download thumbnail for " + video_id + ": " + str(e))
return False
except urllib.error.HTTPError as e:
if e.code == 404:
continue
print(f'Failed to download thumbnail for {video_id}: {e}')
print("Failed to download thumbnail for " + video_id + ": " + str(e))
return False
try:
with open(save_location, 'wb') as f:
f.write(thumbnail)
f = open(save_location, 'wb')
except FileNotFoundError:
os.makedirs(save_directory, exist_ok=True)
with open(save_location, 'wb') as f:
f.write(thumbnail)
f = open(save_location, 'wb')
f.write(thumbnail)
f.close()
return True
print(f'No thumbnail available for {video_id}')
print("No thumbnail available for " + video_id)
return False
@@ -693,7 +690,7 @@ def prefix_urls(item):
def add_extra_html_info(item):
if item['type'] == 'video':
item['url'] = f'{URL_ORIGIN}/watch?v={item["id"]}' if item.get('id') else None
item['url'] = (URL_ORIGIN + '/watch?v=' + item['id']) if item.get('id') else None
video_info = {}
for key in ('id', 'title', 'author', 'duration', 'author_id'):
@@ -716,7 +713,7 @@ def add_extra_html_info(item):
item['url'] = concat_or_none(URL_ORIGIN, "/channel/", item['id'])
if item.get('author_id') and 'author_url' not in item:
item['author_url'] = f'{URL_ORIGIN}/channel/{item["author_id"]}'
item['author_url'] = URL_ORIGIN + '/channel/' + item['author_id']
def check_gevent_exceptions(*tasks):
@@ -962,7 +959,7 @@ def call_youtube_api(client, api, data):
user_agent = context['client'].get('userAgent') or mobile_user_agent
visitor_data = get_visitor_data()
url = f'https://{host}/youtubei/v1/{api}?key={key}'
url = 'https://' + host + '/youtubei/v1/' + api + '?key=' + key
if visitor_data:
context['client'].update({'visitorData': visitor_data})
data['context'] = context
@@ -973,8 +970,8 @@ def call_youtube_api(client, api, data):
headers = ( *headers, ('X-Goog-Visitor-Id', visitor_data ))
response = fetch_url(
url, data=data, headers=headers,
debug_name=f'youtubei_{api}_{client}',
report_text=f'Fetched {client} youtubei {api}'
debug_name='youtubei_' + api + '_' + client,
report_text='Fetched ' + client + ' youtubei ' + api
).decode('utf-8')
return response
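
Note on the retry hunks in this file: each delay is computed as `base_delay * (2 ** attempt)` plus up to one second of random jitter. The snippet below is a minimal standalone sketch of that schedule only. `backoff_delay` and its `rng` parameter are hypothetical names introduced for illustration and are not part of the code in this diff.

```python
import random


def backoff_delay(attempt, base_delay=1.0, rng=random):
    """Return the wait in seconds before retry number `attempt` (0-based).

    Mirrors the expression used in the diff: exponential growth plus
    non-cryptographic jitter between 0 and 1 second. `rng` is a
    hypothetical hook so the jitter source can be replaced in tests.
    """
    return (base_delay * (2 ** attempt)) + rng.uniform(0, 1)


if __name__ == '__main__':
    # max_retries is 5 in the diff, so attempts 0..4 are shown here.
    for attempt in range(5):
        print(f'attempt {attempt}: wait {backoff_delay(attempt):.1f}s')
```

The jitter keeps simultaneous clients from retrying in lockstep; because it only spreads out retries, the standard `random` module is sufficient.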

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = 'v0.5.0'
__version__ = 'v0.4.5'

@@ -1,26 +1,27 @@
import json
import logging
import math
import os
import re
import traceback
import urllib
from math import ceil
from types import SimpleNamespace
from urllib.parse import parse_qs, urlencode
import flask
import gevent
import urllib3.exceptions
from flask import request
import youtube
from youtube import yt_app
from youtube import util, comments, local_playlist, yt_data_extract
from youtube.util import time_utc_isoformat
import settings
from flask import request
import flask
import logging
logger = logging.getLogger(__name__)
import json
import gevent
import os
import math
import traceback
import urllib
import re
import urllib3.exceptions
from urllib.parse import parse_qs, urlencode
from types import SimpleNamespace
from math import ceil
try:
with open(os.path.join(settings.data_dir, 'decrypt_function_cache.json'), 'r') as f:
@@ -53,7 +54,7 @@ def get_video_sources(info, target_resolution):
if fmt['acodec'] and fmt['vcodec']:
if fmt.get('audio_track_is_default', True) is False:
continue
source = {'type': f"video/{fmt['ext']}",
source = {'type': 'video/' + fmt['ext'],
'quality_string': short_video_quality_string(fmt)}
source['quality_string'] += ' (integrated)'
source.update(fmt)
@@ -61,19 +62,17 @@ def get_video_sources(info, target_resolution):
continue
if not (fmt['init_range'] and fmt['index_range']):
# Allow HLS-backed audio tracks (served locally, no init/index needed)
url_value = fmt.get('url', '')
if (not url_value.startswith('http://127.')
and '/ytl-api/' not in url_value):
if not fmt.get('url', '').startswith('http://127.') and not '/ytl-api/' in fmt.get('url', ''):
continue
# Mark as HLS for frontend
fmt['is_hls'] = True
if fmt['acodec'] and not fmt['vcodec'] and (fmt['audio_bitrate'] or fmt['bitrate']):
if fmt['bitrate']:
fmt['audio_bitrate'] = int(fmt['bitrate']/1000)
source = {'type': f"audio/{fmt['ext']}",
source = {'type': 'audio/' + fmt['ext'],
'quality_string': audio_quality_string(fmt)}
source.update(fmt)
source['mime_codec'] = f"{source['type']}; codecs=\"{source['acodec']}\""
source['mime_codec'] = source['type'] + '; codecs="' + source['acodec'] + '"'
tid = fmt.get('audio_track_id') or 'default'
if tid not in audio_by_track:
audio_by_track[tid] = {
@@ -85,11 +84,11 @@ def get_video_sources(info, target_resolution):
elif all(fmt[attr] for attr in ('vcodec', 'quality', 'width', 'fps', 'file_size')):
if codec_name(fmt['vcodec']) == 'unknown':
continue
source = {'type': f"video/{fmt['ext']}",
source = {'type': 'video/' + fmt['ext'],
'quality_string': short_video_quality_string(fmt)}
source.update(fmt)
source['mime_codec'] = f"{source['type']}; codecs=\"{source['vcodec']}\""
quality = f"{fmt['quality']}p{fmt['fps']}"
source['mime_codec'] = source['type'] + '; codecs="' + source['vcodec'] + '"'
quality = str(fmt['quality']) + 'p' + str(fmt['fps'])
video_only_sources.setdefault(quality, []).append(source)
audio_tracks = []
@@ -141,7 +140,7 @@ def get_video_sources(info, target_resolution):
def video_rank(src):
''' Sort by settings preference. Use file size as tiebreaker '''
setting_name = f'codec_rank_{codec_name(src["vcodec"])}'
setting_name = 'codec_rank_' + codec_name(src['vcodec'])
return (settings.current_settings_dict[setting_name],
src['file_size'])
pair_info['videos'].sort(key=video_rank)
@@ -183,7 +182,7 @@ def make_caption_src(info, lang, auto=False, trans_lang=None):
if auto:
label += ' (Automatic)'
if trans_lang:
label += f' -> {trans_lang}'
label += ' -> ' + trans_lang
# Try to use Android caption URL directly (no PO Token needed)
caption_url = None
@@ -204,7 +203,7 @@ def make_caption_src(info, lang, auto=False, trans_lang=None):
else:
caption_url += '&fmt=vtt'
if trans_lang:
caption_url += f'&tlang={trans_lang}'
caption_url += '&tlang=' + trans_lang
url = util.prefix_url(caption_url)
else:
# Fallback to old method
@@ -223,7 +222,7 @@ def lang_in(lang, sequence):
if lang is None:
return False
lang = lang[0:2]
return lang in (item[0:2] for item in sequence)
return lang in (l[0:2] for l in sequence)
def lang_eq(lang1, lang2):
@@ -239,9 +238,9 @@ def equiv_lang_in(lang, sequence):
e.g. if lang is en, extracts en-GB from sequence.
Necessary because if only a specific variant like en-GB is available, can't ask YouTube for simply en. Need to get the available variant.'''
lang = lang[0:2]
for item in sequence:
if item[0:2] == lang:
return item
for l in sequence:
if l[0:2] == lang:
return l
return None
@@ -311,15 +310,7 @@ def get_subtitle_sources(info):
sources[-1]['on'] = True
if len(sources) == 0:
# Invariant: with no caption sources there should be no languages
# either. Don't rely on `assert` which is stripped under `python -O`.
if (len(info['automatic_caption_languages']) != 0
or len(info['manual_caption_languages']) != 0):
logger.warning(
'Unexpected state: no subtitle sources but %d auto / %d manual languages',
len(info['automatic_caption_languages']),
len(info['manual_caption_languages']),
)
assert len(info['automatic_caption_languages']) == 0 and len(info['manual_caption_languages']) == 0
return sources
@@ -357,10 +348,10 @@ def decrypt_signatures(info, video_id):
player_name = info['player_name']
if player_name in decrypt_cache:
print(f'Using cached decryption function for: {player_name}')
print('Using cached decryption function for: ' + player_name)
info['decryption_function'] = decrypt_cache[player_name]
else:
base_js = util.fetch_url(info['base_js'], debug_name='base.js', report_text=f'Fetched player {player_name}')
base_js = util.fetch_url(info['base_js'], debug_name='base.js', report_text='Fetched player ' + player_name)
base_js = base_js.decode('utf-8')
err = yt_data_extract.extract_decryption_function(info, base_js)
if err:
@@ -387,11 +378,11 @@ def fetch_player_response(client, video_id):
def fetch_watch_page_info(video_id, playlist_id, index):
# bpctr=9999999999 will bypass are-you-sure dialogs for controversial
# videos
url = f'https://m.youtube.com/embed/{video_id}?bpctr=9999999999'
url = 'https://m.youtube.com/embed/' + video_id + '?bpctr=9999999999'
if playlist_id:
url += f'&list={playlist_id}'
url += '&list=' + playlist_id
if index:
url += f'&index={index}'
url += '&index=' + index
headers = (
('Accept', '*/*'),
@@ -493,7 +484,7 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
# Register HLS audio tracks for proxy access
added = 0
for lang, track in info['hls_audio_tracks'].items():
ck = f"{video_id}_{lang}"
ck = video_id + '_' + lang
from youtube.hls_cache import register_track
register_track(ck, track['hls_url'],
video_id=video_id, track_id=lang)
@@ -502,7 +493,7 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
'audio_track_id': lang,
'audio_track_name': track['name'],
'audio_track_is_default': track['is_default'],
'itag': f'hls_{lang}',
'itag': 'hls_' + lang,
'ext': 'mp4',
'audio_bitrate': 128,
'bitrate': 128000,
@@ -516,7 +507,7 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
'fps': None,
'init_range': {'start': 0, 'end': 0},
'index_range': {'start': 0, 'end': 0},
'url': f'/ytl-api/audio-track?id={urllib.parse.quote(ck)}',
'url': '/ytl-api/audio-track?id=' + urllib.parse.quote(ck),
's': None,
'sp': None,
'quality': None,
@@ -538,11 +529,11 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
# Register HLS manifest for proxying
if info['hls_manifest_url']:
ck = f"{video_id}_video"
ck = video_id + '_video'
from youtube.hls_cache import register_track
register_track(ck, info['hls_manifest_url'], video_id=video_id, track_id='video')
# Use proxy URL instead of direct Google Video URL
info['hls_manifest_url'] = f'/ytl-api/hls-manifest?id={urllib.parse.quote(ck)}'
info['hls_manifest_url'] = '/ytl-api/hls-manifest?id=' + urllib.parse.quote(ck)
# Fallback to 'ios' if no valid URLs are found
if not info.get('formats') or info.get('player_urls_missing'):
@@ -566,7 +557,7 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
if info.get('formats'):
decryption_error = decrypt_signatures(info, video_id)
if decryption_error:
info['playability_error'] = f'Error decrypting url signatures: {decryption_error}'
info['playability_error'] = 'Error decrypting url signatures: ' + decryption_error
# check if urls ready (non-live format) in former livestream
# urls not ready if all of them have no filesize
@@ -623,9 +614,9 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
def video_quality_string(format):
if format['vcodec']:
result = f"{format['width'] or '?'}x{format['height'] or '?'}"
result = str(format['width'] or '?') + 'x' + str(format['height'] or '?')
if format['fps']:
result += f" {format['fps']}fps"
result += ' ' + str(format['fps']) + 'fps'
return result
elif format['acodec']:
return 'audio only'
@@ -634,7 +625,7 @@ def video_quality_string(format):
def short_video_quality_string(fmt):
result = f"{fmt['quality'] or '?'}p"
result = str(fmt['quality'] or '?') + 'p'
if fmt['fps']:
result += str(fmt['fps'])
if fmt['vcodec'].startswith('av01'):
@@ -642,18 +633,18 @@ def short_video_quality_string(fmt):
elif fmt['vcodec'].startswith('avc'):
result += ' h264'
else:
result += f" {fmt['vcodec']}"
result += ' ' + fmt['vcodec']
return result
def audio_quality_string(fmt):
if fmt['acodec']:
if fmt['audio_bitrate']:
result = f"{fmt['audio_bitrate']}k"
result = '%d' % fmt['audio_bitrate'] + 'k'
else:
result = '?k'
if fmt['audio_sample_rate']:
result += f" {'%.3G' % (fmt['audio_sample_rate']/1000)}kHz"
result += ' ' + '%.3G' % (fmt['audio_sample_rate']/1000) + 'kHz'
return result
elif fmt['vcodec']:
return 'video only'
@@ -678,6 +669,7 @@ def format_bytes(bytes):
@yt_app.route('/ytl-api/audio-track-proxy')
def audio_track_proxy():
"""Proxy for DASH audio tracks to avoid throttling."""
cache_key = request.args.get('id', '')
audio_url = request.args.get('url', '')
if not audio_url:
@@ -700,7 +692,7 @@ def audio_track_proxy():
@yt_app.route('/ytl-api/audio-track')
def get_audio_track():
"""Proxy HLS audio/video: playlist or individual segment."""
from youtube.hls_cache import get_hls_url
from youtube.hls_cache import get_hls_url, _tracks
cache_key = request.args.get('id', '')
seg_url = request.args.get('seg', '')
@@ -737,9 +729,9 @@ def get_audio_track():
seg = line if line.startswith('http') else urljoin(playlist_base, line)
# Always use &seg= parameter, never &url= for segments
playlist_lines.append(
f'{base_url}/ytl-api/audio-track?id='
f'{urllib.parse.quote(cache_key)}'
f'&seg={urllib.parse.quote(seg, safe="")}'
base_url + '/ytl-api/audio-track?id='
+ urllib.parse.quote(cache_key)
+ '&seg=' + urllib.parse.quote(seg, safe='')
)
playlist = '\n'.join(playlist_lines)
@@ -797,7 +789,9 @@ def get_audio_track():
return url
if not url.startswith('http://') and not url.startswith('https://'):
url = urljoin(playlist_base, url)
return f'{base_url}/ytl-api/audio-track?id={urllib.parse.quote(cache_key)}&seg={urllib.parse.quote(url, safe="")}'
return (base_url + '/ytl-api/audio-track?id='
+ urllib.parse.quote(cache_key)
+ '&seg=' + urllib.parse.quote(url, safe=''))
playlist_lines = []
for line in playlist.split('\n'):
@@ -810,7 +804,7 @@ def get_audio_track():
if line.startswith('#') and 'URI=' in line:
def rewrite_uri_attr(match):
uri = match.group(1)
return f'URI="{proxy_url(uri)}"'
return 'URI="' + proxy_url(uri) + '"'
line = _re.sub(r'URI="([^"]+)"', rewrite_uri_attr, line)
playlist_lines.append(line)
elif line.startswith('#'):
@@ -881,7 +875,9 @@ def get_audio_track():
if segment_url.startswith('/ytl-api/audio-track'):
return segment_url
base_url = request.url_root.rstrip('/')
return f'{base_url}/ytl-api/audio-track?id={urllib.parse.quote(cache_key)}&seg={urllib.parse.quote(segment_url)}'
return (base_url + '/ytl-api/audio-track?id='
+ urllib.parse.quote(cache_key)
+ '&seg=' + urllib.parse.quote(segment_url))
playlist_lines = []
for line in playlist.split('\n'):
@@ -920,7 +916,7 @@ def get_hls_manifest():
flask.abort(404, 'HLS manifest not found')
try:
print('[hls-manifest] Fetching HLS manifest...')
print(f'[hls-manifest] Fetching HLS manifest...')
manifest = util.fetch_url(hls_url,
headers=(('User-Agent', 'Mozilla/5.0'),),
debug_name='hls_manifest').decode('utf-8')
@@ -945,10 +941,14 @@ def get_hls_manifest():
if is_audio_track:
# Audio track playlist - proxy through audio-track endpoint
return f'{base_url}/ytl-api/audio-track?id={urllib.parse.quote(cache_key)}&url={urllib.parse.quote(url, safe="")}'
return (base_url + '/ytl-api/audio-track?id='
+ urllib.parse.quote(cache_key)
+ '&url=' + urllib.parse.quote(url, safe=''))
else:
# Video segment or variant playlist - proxy through audio-track endpoint
return f'{base_url}/ytl-api/audio-track?id={urllib.parse.quote(cache_key)}&seg={urllib.parse.quote(url, safe="")}'
return (base_url + '/ytl-api/audio-track?id='
+ urllib.parse.quote(cache_key)
+ '&seg=' + urllib.parse.quote(url, safe=''))
# Parse and rewrite the manifest
manifest_lines = []
@@ -966,7 +966,7 @@ def get_hls_manifest():
nonlocal rewritten_count
uri = match.group(1)
rewritten_count += 1
return f'URI="{rewrite_url(uri, is_audio_track=True)}"'
return 'URI="' + rewrite_url(uri, is_audio_track=True) + '"'
line = _re.sub(r'URI="([^"]+)"', rewrite_media_uri, line)
manifest_lines.append(line)
elif line.startswith('#'):
@@ -1018,8 +1018,7 @@ def get_storyboard_vtt():
for i, board in enumerate(boards):
*t, _, sigh = board.split("#")
width, height, count, width_cnt, height_cnt, interval = map(int, t)
if height != wanted_height:
continue
if height != wanted_height: continue
q['sigh'] = [sigh]
url = f"{base_url}?{urlencode(q, doseq=True)}"
storyboard = SimpleNamespace(
@@ -1045,7 +1044,7 @@ def get_storyboard_vtt():
ts = 0 # current timestamp
for i in range(storyboard.storyboard_count):
url = f'/{storyboard.url.replace("$M", str(i))}'
url = '/' + storyboard.url.replace("$M", str(i))
interval = storyboard.interval
w, h = storyboard.width, storyboard.height
w_cnt, h_cnt = storyboard.width_cnt, storyboard.height_cnt
@@ -1070,7 +1069,7 @@ def get_watch_page(video_id=None):
if not video_id:
return flask.render_template('error.html', error_message='Missing video id'), 404
if len(video_id) < 11:
return flask.render_template('error.html', error_message=f'Incomplete video id (too short): {video_id}'), 404
return flask.render_template('error.html', error_message='Incomplete video id (too short): ' + video_id), 404
time_start_str = request.args.get('t', '0s')
time_start = 0
@@ -1133,9 +1132,9 @@ def get_watch_page(video_id=None):
util.prefix_urls(item)
util.add_extra_html_info(item)
if playlist_id:
item['url'] += f'&list={playlist_id}'
item['url'] += '&list=' + playlist_id
if item['index']:
item['url'] += f'&index={item["index"]}'
item['url'] += '&index=' + str(item['index'])
info['playlist']['author_url'] = util.prefix_url(
info['playlist']['author_url'])
if settings.img_prefix:
@@ -1151,16 +1150,16 @@ def get_watch_page(video_id=None):
filename = title
ext = fmt.get('ext')
if ext:
filename += f'.{ext}'
filename += '.' + ext
fmt['url'] = fmt['url'].replace(
'/videoplayback',
f'/videoplayback/name/{filename}')
'/videoplayback/name/' + filename)
download_formats = []
for format in (info['formats'] + info['hls_formats']):
if format['acodec'] and format['vcodec']:
codecs_string = f"{format['acodec']}, {format['vcodec']}"
codecs_string = format['acodec'] + ', ' + format['vcodec']
else:
codecs_string = format['acodec'] or format['vcodec'] or '?'
download_formats.append({
@@ -1183,6 +1182,7 @@ def get_watch_page(video_id=None):
uni_sources = video_sources['uni_sources']
pair_sources = video_sources['pair_sources']
pair_idx = video_sources['pair_idx']
audio_track_sources = video_sources['audio_track_sources']
# Build audio tracks list from HLS
audio_tracks = []
@@ -1239,9 +1239,12 @@ def get_watch_page(video_id=None):
for source in subtitle_sources:
best_caption_parse = urllib.parse.urlparse(
source['url'].lstrip('/'))
transcript_url = f'{util.URL_ORIGIN}/watch/transcript{best_caption_parse.path}?{best_caption_parse.query}'
transcript_url = (util.URL_ORIGIN
+ '/watch/transcript'
+ best_caption_parse.path
+ '?' + best_caption_parse.query)
other_downloads.append({
'label': f'Video Transcript: {source["label"]}',
'label': 'Video Transcript: ' + source['label'],
'ext': 'txt',
'url': transcript_url
})
@@ -1252,7 +1255,7 @@ def get_watch_page(video_id=None):
template_name = 'watch.html'
return flask.render_template(template_name,
header_playlist_names = local_playlist.get_playlist_names(),
uploader_channel_url = f'/{info["author_url"]}' if info['author_url'] else '',
uploader_channel_url = ('/' + info['author_url']) if info['author_url'] else '',
time_published = info['time_published'],
view_count = (lambda x: '{:,}'.format(x) if x is not None else "")(info.get("view_count", None)),
like_count = (lambda x: '{:,}'.format(x) if x is not None else "")(info.get("like_count", None)),
@@ -1294,10 +1297,10 @@ def get_watch_page(video_id=None):
ip_address = info['ip_address'] if settings.route_tor else None,
invidious_used = info['invidious_used'],
invidious_reload_button = info['invidious_reload_button'],
video_url = f'{util.URL_ORIGIN}/watch?v={video_id}',
video_url = util.URL_ORIGIN + '/watch?v=' + video_id,
video_id = video_id,
storyboard_url = (f'{util.URL_ORIGIN}/ytl-api/storyboard.vtt?'
f'{urlencode([("spec_url", info["storyboard_spec_url"])])}'
storyboard_url = (util.URL_ORIGIN + '/ytl-api/storyboard.vtt?' +
urlencode([('spec_url', info['storyboard_spec_url'])])
if info['storyboard_spec_url'] else None),
js_data = {
@@ -1324,7 +1327,7 @@ def get_watch_page(video_id=None):
@yt_app.route('/api/<path:dummy>')
def get_captions(dummy):
url = f'https://www.youtube.com{request.full_path}'
url = 'https://www.youtube.com' + request.full_path
try:
result = util.fetch_url(url, headers=util.mobile_ua)
result = result.replace(b"align:start position:0%", b"")
@@ -1339,9 +1342,12 @@ inner_timestamp_removal_reg = re.compile(r'<[^>]+>')
@yt_app.route('/watch/transcript/<path:caption_path>')
def get_transcript(caption_path):
try:
captions = util.fetch_url(f'https://www.youtube.com/{caption_path}?{request.environ["QUERY_STRING"]}').decode('utf-8')
captions = util.fetch_url('https://www.youtube.com/'
+ caption_path
+ '?' + request.environ['QUERY_STRING']).decode('utf-8')
except util.FetchError as e:
msg = f'Error retrieving captions: {e}\n\nThe caption url may have expired.'
msg = ('Error retrieving captions: ' + str(e) + '\n\n'
+ 'The caption url may have expired.')
print(msg)
return flask.Response(
msg,
@@ -1389,7 +1395,7 @@ def get_transcript(caption_path):
result = ''
for seg in segments:
if seg['text'] != ' ':
result += f"{seg['begin']} {seg['text']}\r\n"
result += seg['begin'] + ' ' + seg['text'] + '\r\n'
return flask.Response(result.encode('utf-8'),
mimetype='text/plain;charset=UTF-8')
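
Note on the playlist-rewriting hunks in this file: segment lines and `URI="..."` attributes are routed through `/ytl-api/audio-track` with the original URL percent-encoded in `seg`. The snippet below is a simplified sketch of that idea, not the handler itself. `rewrite_playlist` is a hypothetical helper, and it leaves out details of the real code such as skipping URLs that are already proxied.

```python
import re
import urllib.parse
from urllib.parse import urljoin


def rewrite_playlist(playlist, playlist_base, base_url, cache_key):
    """Point every segment and URI attribute of an HLS playlist at the proxy."""

    def proxy_url(url):
        # Resolve relative URLs against the playlist's own location.
        if not url.startswith(('http://', 'https://')):
            url = urljoin(playlist_base, url)
        return (base_url + '/ytl-api/audio-track?id='
                + urllib.parse.quote(cache_key)
                + '&seg=' + urllib.parse.quote(url, safe=''))

    out = []
    for line in playlist.split('\n'):
        line = line.strip()
        if not line:
            continue  # drop blank lines
        if line.startswith('#'):
            # Tags such as #EXT-X-MAP carry their URL in a URI="..." attribute.
            line = re.sub(r'URI="([^"]+)"',
                          lambda m: 'URI="' + proxy_url(m.group(1)) + '"',
                          line)
            out.append(line)
        else:
            # Any non-tag line is a segment (or variant playlist) URL.
            out.append(proxy_url(line))
    return '\n'.join(out)
```

Passing `safe=''` to `urllib.parse.quote` also encodes `/`, so the upstream URL travels as a single query value.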

@@ -212,7 +212,7 @@ def extract_date(date_text):
month, day, year = parts[-3:]
month = MONTH_ABBREVIATIONS.get(month[0:3]) # slicing in case they start writing out the full month name
if month and (re.fullmatch(r'\d\d?', day) is not None) and (re.fullmatch(r'\d{4}', year) is not None):
return f'{year}-{month}-{day}'
return year + '-' + month + '-' + day
return None
def check_missing_keys(object, *key_sequences):
@@ -222,7 +222,7 @@ def check_missing_keys(object, *key_sequences):
for key in key_sequence:
_object = _object[key]
except (KeyError, IndexError, TypeError):
return f'Could not find {key}'
return 'Could not find ' + key
return None
@@ -467,7 +467,7 @@ def extract_item_info(item, additional_info={}):
['shortBylineText', 'runs', 0, 'navigationEndpoint', 'browseEndpoint', 'browseId'],
['ownerText', 'runs', 0, 'navigationEndpoint', 'browseEndpoint', 'browseId']
))
info['author_url'] = f'https://www.youtube.com/channel/{info["author_id"]}' if info['author_id'] else None
info['author_url'] = ('https://www.youtube.com/channel/' + info['author_id']) if info['author_id'] else None
info['description'] = extract_formatted_text(multi_deep_get(
item,
['descriptionText'], ['descriptionSnippet'],

@@ -305,7 +305,7 @@ def extract_playlist_metadata(polymer_json):
metadata['description'] = desc
if metadata['author_id']:
metadata['author_url'] = f'https://www.youtube.com/channel/{metadata["author_id"]}'
metadata['author_url'] = 'https://www.youtube.com/channel/' + metadata['author_id']
if metadata['first_video_id'] is None:
metadata['thumbnail'] = None

@@ -650,9 +650,9 @@ def _extract_playability_error(info, player_response, error_prefix=''):
)
if playability_status not in (None, 'OK'):
info['playability_error'] = f'{error_prefix}{playability_reason}'
info['playability_error'] = error_prefix + playability_reason
elif not info['playability_error']: # do not override
info['playability_error'] = f'{error_prefix}Unknown playability error'
info['playability_error'] = error_prefix + 'Unknown playability error'
SUBTITLE_FORMATS = ('srv1', 'srv2', 'srv3', 'ttml', 'vtt')
def extract_watch_info(polymer_json):
@@ -726,7 +726,7 @@ def extract_watch_info(polymer_json):
# Store the full URL from the player response (includes valid tokens)
if base_url:
normalized = normalize_url(base_url) if base_url.startswith('/') or not base_url.startswith('http') else base_url
info['_caption_track_urls'][f'{lang_code}_{"asr" if caption_track.get("kind") == "asr" else ""}'] = normalized
info['_caption_track_urls'][lang_code + ('_asr' if caption_track.get('kind') == 'asr' else '')] = normalized
lang_name = deep_get(urllib.parse.parse_qs(urllib.parse.urlparse(base_url).query), 'name', 0)
if lang_name:
info['_manual_caption_language_names'][lang_code] = lang_name
@@ -806,7 +806,7 @@ def extract_watch_info(polymer_json):
info['allowed_countries'] = mf.get('availableCountries', [])
# other stuff
info['author_url'] = f'https://www.youtube.com/channel/{info["author_id"]}' if info['author_id'] else None
info['author_url'] = 'https://www.youtube.com/channel/' + info['author_id'] if info['author_id'] else None
info['storyboard_spec_url'] = deep_get(player_response, 'storyboards', 'playerStoryboardSpecRenderer', 'spec')
return info
@@ -912,12 +912,12 @@ def get_caption_url(info, language, format, automatic=False, translation_languag
url = info['_captions_base_url']
if not url:
return None
url += f'&lang={language}'
url += f'&fmt={format}'
url += '&lang=' + language
url += '&fmt=' + format
if automatic:
url += '&kind=asr'
elif language in info['_manual_caption_language_names']:
url += f'&name={urllib.parse.quote(info["_manual_caption_language_names"][language], safe="")}'
url += '&name=' + urllib.parse.quote(info['_manual_caption_language_names'][language], safe='')
if translation_language:
url += '&tlang=' + translation_language
@@ -964,7 +964,7 @@ def extract_decryption_function(info, base_js):
return 'Could not find var_name'
var_name = var_with_operation_match.group(1)
var_body_match = re.search(rf'var {re.escape(var_name)}=\{{(.*?)\}};', base_js, flags=re.DOTALL)
var_body_match = re.search(r'var ' + re.escape(var_name) + r'=\{(.*?)\};', base_js, flags=re.DOTALL)
if var_body_match is None:
return 'Could not find var_body'
@@ -988,7 +988,7 @@ def extract_decryption_function(info, base_js):
elif op_body.startswith('var c=a[0]'):
operation_definitions[op_name] = 2
else:
return f'Unknown op_body: {op_body}'
return 'Unknown op_body: ' + op_body
decryption_function = []
for op_with_arg in function_body:
@@ -997,7 +997,7 @@ def extract_decryption_function(info, base_js):
return 'Could not parse operation with arg'
op_name = match.group(2).strip('[].')
if op_name not in operation_definitions:
return f'Unknown op_name: {op_name}'
return 'Unknown op_name: ' + str(op_name)
op_argument = match.group(3)
decryption_function.append([operation_definitions[op_name], int(op_argument)])
@@ -1028,5 +1028,5 @@ def decrypt_signatures(info):
_operation_2(a, argument)
signature = ''.join(a)
format['url'] += f'&{format["sp"]}={signature}'
format['url'] += '&' + format['sp'] + '=' + signature
return False
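
The two caption-key expressions in the `extract_watch_info` hunk above are not interchangeable for manual (non-ASR) tracks, so it may be worth checking which key format the lookup side expects. The sketch below isolates both expressions for comparison; `caption_key_fstring` and `caption_key_concat` are hypothetical helper names introduced only for illustration and do not exist in the codebase.

```python
def caption_key_fstring(lang_code, kind):
    # f-string form from the hunk: the underscore is always part of the key,
    # even when the track is not an automatic (ASR) one.
    return f'{lang_code}_{"asr" if kind == "asr" else ""}'


def caption_key_concat(lang_code, kind):
    # Concatenation form from the hunk: the '_asr' suffix is appended only
    # for automatic captions; manual tracks keep the bare language code.
    return lang_code + ('_asr' if kind == 'asr' else '')


if __name__ == '__main__':
    for kind in ('asr', None):
        print(repr(kind),
              repr(caption_key_fstring('en', kind)),
              repr(caption_key_concat('en', kind)))
```

For automatic captions both forms agree; for manual tracks the f-string form leaves a trailing underscore while the concatenation form does not.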