29 Commits

Author SHA1 Message Date
62a028968e chore: extend .gitignore with AI assistant configurations and caches
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 17s
CI / test (push) Successful in 50s
2026-04-04 15:08:13 -05:00
f7bbf3129a update ios client
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 14s
CI / test (push) Successful in 53s
2026-04-04 15:05:33 -05:00
688521f8d6 bump to v0.4.5
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 50s
2026-04-01 11:54:46 -05:00
6eb3741010 test: add unit tests for YouTube Shorts support
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 51s
18 tests covering:
- channel_ctoken_v5 protobuf token generation per tab
- shortsLockupViewModel parsing (id, title, thumbnail, type)
- View count formatting with K/M/B suffixes
- extract_items with reloadContinuationItemsCommand response format

All tests run offline with mocked data, no network access.
2026-04-01 11:51:42 -05:00
a374f90f6e fix: add support for YouTube Shorts tab on channel pages
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 56s
- Rewrite channel_ctoken_v5 with correct protobuf field numbers per tab
  (videos=15, shorts=10, streams=14) based on Invidious source
- Replace broken pbj=1 endpoint with youtubei browse API for shorts/streams
- Add shortsLockupViewModel parser to extract video data from new YT format
- Fix channel metadata not loading (get_metadata now uses browse API)
- Fix metadata caching: skip caching when channel_name is absent
- Show actual item count instead of UU playlist count for shorts/streams
- Format view counts with spaced suffixes (7.1 K, 1.2 M, 3 B)
2026-04-01 11:43:46 -05:00
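
The spaced-suffix view-count format from the last bullet can be sketched roughly as follows. This is only an illustration, not the project's implementation: `format_approx_views`, its regular expression, and the suffix table are hypothetical, and the input phrasing ("7.1 thousand views") is assumed from the accessibility text used in tests/test_shorts.py.

```python
import re

# Hypothetical mapping from the spelled-out magnitude to a one-letter suffix.
_SUFFIXES = {'thousand': 'K', 'million': 'M', 'billion': 'B'}

# A number, an optional magnitude word, then the word "view" or "views".
_VIEWS_RE = re.compile(
    r'([\d.,]+)\s*(thousand|million|billion)?\s+views?', re.IGNORECASE)


def format_approx_views(text):
    """Return a view count such as '7.1 K' found in text, or None."""
    match = _VIEWS_RE.search(text)
    if match is None:
        return None
    number, magnitude = match.group(1), match.group(2)
    if magnitude is None:
        # Plain counts such as "25 views" carry no suffix.
        return number
    return '%s %s' % (number, _SUFFIXES[magnitude.lower()])
```

Numbers that appear in the title itself (for example "DECEMBER 10 and 11") are skipped because they are not followed by "views".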
bed14713ad bump to v0.4.4
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 45s
2026-03-31 21:48:46 -05:00
06051dd127 fix: support YouTube 2024+ data formats for playlists, podcasts and channels
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 51s
- Add PODCAST content type support in lockupViewModel extraction
- Extract thumbnails and episode count from thumbnail overlay badges
- Migrate playlist page fetching from pbj=1 to innertube API (youtubei/v1/browse)
- Support new pageHeaderRenderer format in playlist metadata extraction
- Fix subscriber count extraction when YouTube returns handle instead of count
- Hide "None subscribers" in template when data is unavailable
2026-03-31 21:38:51 -05:00
7c64630be1 update .gitignore
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 12s
CI / test (push) Successful in 52s
2026-03-28 21:49:26 -05:00
1aa344c7b0 bump to v0.4.3
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 46s
2026-03-28 16:09:23 -05:00
fa7273b328 fix: race condition in os.makedirs causing worker crashes
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 47s
Replace check-then-create pattern with exist_ok=True to prevent
FileExistsError when multiple workers initialize simultaneously.

Affects:
- subscriptions.py: open_database()
- watch.py: save_decrypt_cache()
- local_playlist.py: add_to_playlist()
- util.py: fetch_url(), get_visitor_data()
- settings.py: initialization

Fixes Gunicorn worker startup failures in multi-worker deployments.
2026-03-28 16:06:47 -05:00
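
The difference between the two patterns named in this commit can be sketched as below. The function names and the temporary directory are hypothetical; only `os.makedirs(..., exist_ok=True)` comes from the commit itself.

```python
import os
import tempfile


def ensure_dir_racy(path):
    """Check-then-create: a second worker can create `path` between the
    exists() check and makedirs(), which then raises FileExistsError."""
    if not os.path.exists(path):
        os.makedirs(path)


def ensure_dir(path):
    """Race-free variant: succeeds whether or not `path` already exists."""
    os.makedirs(path, exist_ok=True)


# Calling the safe variant twice stands in for two workers starting together.
demo_dir = os.path.join(tempfile.mkdtemp(), 'data', 'cache')
ensure_dir(demo_dir)
ensure_dir(demo_dir)
```

With `exist_ok=True` the call is idempotent, so the order in which workers reach it no longer matters.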
a0d10e6a00 docs: remove duplicate FreeTube entry in README
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 44s
2026-03-27 21:29:46 -05:00
a46cfda029 bump to v0.4.2
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 12s
CI / test (push) Successful in 46s
2026-03-27 21:26:08 -05:00
e03f40d728 fix error handling, null URLs in templates, and Radio playlist support
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 49s
- Global error handler: friendly messages for 429, 502, 403, 400
  instead of raw tracebacks. Filter FetchError from Flask logger.
- Fix None URLs in templates: protect href/src in common_elements,
  playlist, watch, and comments templates against None values.
- Radio playlists (RD...): redirect /playlist?list=RD... to
  /watch?v=...&list=RD... since YouTube only supports them in player.
- Wrap player client fallbacks (ios, tv_embedded) in try/except so
  a failed fallback doesn't crash the whole page.
2026-03-27 21:23:03 -05:00
22c72aa842 remove yt-dlp, fix captions PO Token issue, fix 429 retry logic
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 52s
- Remove yt-dlp entirely (modules, routes, settings, dependency)
  Was blocking page loads by running synchronously in gevent
- Fix captions: use Android client caption URLs (no PO Token needed)
  instead of web timedtext URLs that YouTube now blocks
- Fix 429 retry: fail immediately without Tor (same IP = pointless retry)
  Was causing ~27s delays with exponential backoff
- Accept ytdlp_enabled as legacy setting to avoid warning on startup
2026-03-27 20:47:44 -05:00
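
The 429 policy described above (no retries without Tor, backoff otherwise) might look roughly like this. It is a sketch under assumptions: `plan_429_retries`, `base_delay`, and the doubling schedule are hypothetical, and the default of 5 retries is taken from the startup tips printed by server.py, not from the retry code itself.

```python
def plan_429_retries(use_tor, max_retries=5, base_delay=1.0):
    """Return the waits (in seconds) before each retry after an HTTP 429.

    Without Tor every retry would come from the same IP address, so the
    plan is empty and the caller should fail immediately.
    """
    if not use_tor:
        return []
    # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]
```

An empty plan means the error is reported at once instead of after a long chain of sleeps.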
56ecd6cb1b fix: use YouTube-provided thumbnail URLs instead of hardcoded hq720.jpg
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 15s
CI / test (push) Successful in 58s
Videos without hq720.jpg thumbnails caused mass 404 errors.
Now preserves the actual thumbnail URL from YouTube's API response,
falls back to hqdefault.jpg only when no thumbnail is provided.
Also picks highest quality thumbnail from API (thumbnails[-1])
and adds progressive fallback for subscription/download functions.
2026-03-27 19:22:12 -05:00
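
A minimal sketch of the selection rule in this commit, assuming the API response exposes a list of dicts with a `url` key ordered from lowest to highest quality. `pick_thumbnail` is a hypothetical helper, not a function from the repository.

```python
def pick_thumbnail(video_id, thumbnails):
    """Return the best thumbnail URL from an API-style thumbnails list."""
    if thumbnails:
        # The last entry is assumed to be the highest quality one.
        url = thumbnails[-1].get('url')
        if url:
            return url
    # No usable thumbnail in the response: fall back to hqdefault.jpg.
    return 'https://i.ytimg.com/vi/%s/hqdefault.jpg' % video_id
```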
f629565e77 bump to v0.4.1
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 13s
CI / test (push) Successful in 48s
2026-03-22 21:27:50 -05:00
1f8c13adff feat: improve 429 handling with Tor support and clean CI
All checks were successful
git-sync-with-mirror / git-sync (push) Successful in 11s
CI / test (push) Successful in 50s
- Retry with new Tor identity on 429
- Improve error logging
- Remove .build.yml and .drone.yml
2026-03-22 21:25:57 -05:00
6a68f06645 Release v0.4.0 - HD Thumbnails, YouTube 2024+ Support, and yt-dlp Integration
Some checks failed
CI / test (push) Failing after 1m19s
Major Features:
- HD video thumbnails (hq720.jpg) with automatic fallback to lower qualities
- HD channel avatars (240x240 instead of 88x88)
- YouTube 2024+ lockupViewModel support for channel playlists
- youtubei/v1/browse API integration for channel playlist tabs
- yt-dlp integration for multi-language audio and subtitles

Bug Fixes:
- Fixed undefined `abort` import in playlist.py
- Fixed undefined functions in proto.py (encode_varint, bytes_to_hex, succinct_encode)
- Fixed missing `traceback` import in proto_debug.py
- Fixed blurry playlist thumbnails using default.jpg instead of HD versions
- Fixed channel playlists page using deprecated pbj=1 format

Improvements:
- Automatic thumbnail fallback system (hq720 → sddefault → hqdefault → mqdefault → default)
- JavaScript thumbnail_fallback() handler for 404 errors
- Better thumbnail quality across all pages (watch, channel, playlist, subscriptions)
- Consistent HD avatar display for all channel items
- Settings system automatically adds new settings without breaking user config

Files Modified:
- youtube/watch.py - HD thumbnails for related videos and playlist items
- youtube/channel.py - HD thumbnails for channel playlists, youtubei API integration
- youtube/playlist.py - HD thumbnails, fixed abort import
- youtube/util.py - HD thumbnail URLs, avatar HD upgrade, prefix_url improvements
- youtube/comments.py - HD video thumbnail
- youtube/subscriptions.py - HD thumbnails, fixed abort import
- youtube/yt_data_extract/common.py - lockupViewModel support, extract_lockup_view_model_info()
- youtube/yt_data_extract/everything_else.py - HD playlist thumbnails
- youtube/proto.py - Fixed undefined function references
- youtube/proto_debug.py - Added traceback import
- youtube/static/js/common.js - thumbnail_fallback() handler
- youtube/templates/*.html - Added onerror handlers for thumbnail fallback
- youtube/version.py - Bump to v0.4.0

Technical Details:
- All thumbnail URLs now use hq720.jpg (1280x720) when available
- Fallback handled client-side via JavaScript onerror handler
- Server-side avatar upgrade via regex in util.prefix_url()
- lockupViewModel parser extracts contentType, metadata, and first_video_id
- Channel playlist tabs now use youtubei/v1/browse instead of deprecated pbj=1
- Settings version system ensures backward compatibility
2026-03-22 20:50:03 -05:00
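
The fallback chain listed under Improvements can be illustrated as below. According to the commit, the project does this client-side in JavaScript through `thumbnail_fallback()`; the Python here only sketches the chain logic, and `next_thumbnail_url` is a hypothetical name.

```python
import re

# Quality names in the order given by the commit message.
QUALITY_CHAIN = ['hq720', 'sddefault', 'hqdefault', 'mqdefault', 'default']
_NAME_RE = re.compile(r'/(%s)\.jpg' % '|'.join(QUALITY_CHAIN))


def next_thumbnail_url(url):
    """Return url with the next lower quality name, or None at the end."""
    match = _NAME_RE.search(url)
    if match is None:
        return None
    index = QUALITY_CHAIN.index(match.group(1))
    if index + 1 >= len(QUALITY_CHAIN):
        return None
    replacement = '/%s.jpg' % QUALITY_CHAIN[index + 1]
    return url[:match.start()] + replacement + url[match.end():]
```

An error handler would call this each time an image fails to load and stop once it returns None.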
84e1acaab8 yt-dlp 2026-03-22 14:17:23 -05:00
Jesus
ed4b05d9b6 Bump version to v0.3.2 2025-03-08 16:41:58 -05:00
Jesus
6f88b1cec6 Refactor extract_info in watch.py to improve client flexibility
Introduce primary_client, fallback_client, and last_resort_client variables for better configurability.
Replace hardcoded 'android_vr' with primary_client in fetch_player_response call.
2025-03-08 16:40:51 -05:00
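
One way to picture the primary/fallback/last-resort arrangement is an ordered list of clients tried in turn. This is a sketch only: `fetch_with_fallbacks` and the `fetch` callable are hypothetical stand-ins, and the real `extract_info` in watch.py may be structured differently.

```python
def fetch_with_fallbacks(fetch, video_id, clients):
    """Try each client in order; return (client, response) for the first
    call that succeeds.

    `fetch` is any callable taking (video_id, client). Catching Exception
    broadly is a simplification for this sketch.
    """
    last_error = None
    for client in clients:
        try:
            return client, fetch(video_id, client)
        except Exception as exc:
            last_error = exc
    raise RuntimeError('all player clients failed') from last_error
```

A caller would pass the clients in priority order, for example `[primary_client, fallback_client, last_resort_client]`.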
Jesus
03451fb8ae fix: prevent error when closing avMerge if not a function 2025-03-08 16:39:37 -05:00
Jesus
e45c3fd48b Add error styles to player 2025-03-08 16:38:31 -05:00
Jesus
1153ac8f24 Fix NoneType inside comments.py
Bug:

Traceback (most recent call last):
  File "/home/rusian/yt-local/youtube/comments.py", line 180, in video_comments
    post_process_comments_info(comments_info)
  File "/home/rusian/yt-local/youtube/comments.py", line 81, in post_process_comments_info
    comment['author'] = strip_non_ascii(comment['author'])
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rusian/yt-local/youtube/util.py", line 843, in strip_non_ascii
    stripped = (c for c in string if 0 < ord(c) < 127)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 900, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/rusian/yt-local/youtube/comments.py", line 195, in video_comments
    comments_info['error'] = 'YouTube blocked the request. IP address: %s' % e.ip
                                                                             ^^^^
AttributeError: 'TypeError' object has no attribute 'ip'
2025-03-08T01:25:47Z <Greenlet at 0x7f251e5279c0: video_comments('hcm55lU9knw', 0, lc='')> failed with AttributeError
2025-03-08 16:37:33 -05:00
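
The traceback shows two separate failures: `strip_non_ascii` receiving None, and the error handler assuming every exception has an `ip` attribute. A defensive version of both could look like the sketch below. The generator expression is copied from the traceback; the None guard and the `describe_error` helper are assumptions, since the actual fix is not shown here.

```python
def strip_non_ascii(string):
    """Return string with non-ASCII characters removed; None becomes ''."""
    if string is None:
        return ''
    stripped = (c for c in string if 0 < ord(c) < 127)
    return ''.join(stripped)


def describe_error(exc):
    """Build an error message without assuming exc has an `ip` attribute."""
    ip = getattr(exc, 'ip', None)
    if ip is not None:
        return 'YouTube blocked the request. IP address: %s' % ip
    return '%s: %s' % (type(exc).__name__, exc)
```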
Jesus
c256a045f9 Bump version to v0.3.1 2025-03-08 16:34:29 -05:00
Jesus
98603439cb Improve buffer management for different platforms
- Introduced `BUFFER_CONFIG` to define buffer sizes for various systems (webOS, Samsung Tizen, Android TV, desktop).
- Added `detectSystem()` function to determine the platform based on `navigator.userAgent`.
- Updated `Stream` constructor to use platform-specific buffer sizes dynamically.
- Added console log for debugging detected system and applied buffer size.
2025-03-08 16:32:26 -05:00
Jesus
a6ca011202 version v0.3.0 2025-03-08 16:28:39 -05:00
Jesus
114c2572a4 Renew plyr UI and simplify elements 2025-03-08 16:28:27 -05:00
f64b362603 update logic plyr-start.js 2025-03-03 08:20:41 +08:00
41 changed files with 2099 additions and 371 deletions

@@ -1,12 +0,0 @@
image: debian/buster
packages:
  - python3-pip
  - virtualenv
tasks:
  - test: |
      cd yt-local
      virtualenv -p python3 venv
      source venv/bin/activate
      python --version
      pip install -r requirements-dev.txt
      pytest

@@ -1,10 +0,0 @@
kind: pipeline
name: default
steps:
- name: test
  image: python:3.7.3
  commands:
  - pip install --upgrade pip
  - pip install -r requirements-dev.txt
  - pytest

163
.gitignore vendored

@@ -1,15 +1,166 @@
# =============================================================================
# .gitignore - YT Local
# =============================================================================
# -----------------------------------------------------------------------------
# Python / Bytecode
# -----------------------------------------------------------------------------
__pycache__/
*.py[cod]
*$py.class
debug/
*.so
.Python
# -----------------------------------------------------------------------------
# Virtual Environments
# -----------------------------------------------------------------------------
.env
.env.*
!.env.example
.venv/
venv/
ENV/
env/
*.egg-info/
.eggs/
# -----------------------------------------------------------------------------
# IDE / Editors
# -----------------------------------------------------------------------------
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
.flycheck_*
*.sublime-project
*.sublime-workspace
# -----------------------------------------------------------------------------
# Distribution / Packaging
# -----------------------------------------------------------------------------
build/
dist/
*.egg
*.manifest
*.spec
pip-wheel-metadata/
share/python-wheels/
MANIFEST
# -----------------------------------------------------------------------------
# Testing / Coverage
# -----------------------------------------------------------------------------
.pytest_cache/
.coverage
.coverage.*
htmlcov/
.tox/
.nox/
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
# -----------------------------------------------------------------------------
# Type Checking / Linting
# -----------------------------------------------------------------------------
.mypy_cache/
.dmypy.json
dmypy.json
.pyre/
# -----------------------------------------------------------------------------
# Jupyter / IPython
# -----------------------------------------------------------------------------
.ipynb_checkpoints
profile_default/
ipython_config.py
# -----------------------------------------------------------------------------
# Python Tools
# -----------------------------------------------------------------------------
# pyenv
.python-version
# pipenv
Pipfile.lock
# PEP 582
__pypackages__/
# Celery
celerybeat-schedule
celerybeat.pid
# Sphinx
docs/_build/
# PyBuilder
target/
# Scrapy
.scrapy
# -----------------------------------------------------------------------------
# Web Frameworks
# -----------------------------------------------------------------------------
# Django
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask
instance/
.webassets-cache
# -----------------------------------------------------------------------------
# Documentation
# -----------------------------------------------------------------------------
# mkdocs
/site
# -----------------------------------------------------------------------------
# Project Specific - YT Local
# -----------------------------------------------------------------------------
# Data & Debug
data/
python/
debug/
# Release artifacts
release/
yt-local/
banned_addresses.txt
settings.txt
get-pip.py
latest-dist.zip
*.7z
*.zip
*venv*
flycheck_*
# Configuration (contains user-specific data)
settings.txt
banned_addresses.txt
# -----------------------------------------------------------------------------
# Temporary / Backup Files
# -----------------------------------------------------------------------------
*.log
*.tmp
*.bak
*.orig
*.cache/
# -----------------------------------------------------------------------------
# AI assistants / LLM tools
# -----------------------------------------------------------------------------
# Claude AI assistant configuration and cache
.claude/
claude*
.anthropic/
# Kiro AI tool configuration and cache
.kiro/
kiro*
# Qwen AI-related files and caches
.qwen/
qwen*
# Other AI assistants/IDE integrations
.cursor/
.gpt/
.openai/

210
Makefile Normal file

@@ -0,0 +1,210 @@
# yt-local Makefile
# Automated tasks for development, translations, and maintenance
.PHONY: help install dev clean test i18n-extract i18n-init i18n-update i18n-compile i18n-stats i18n-clean setup-dev lint format backup restore
# Variables
PYTHON := python3
PIP := pip3
LANG_CODE ?= es
VENV_DIR := venv
PROJECT_NAME := yt-local
## Help
help: ## Show this help message
	@echo "$(PROJECT_NAME) - Available tasks:"
	@echo ""
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf " %-20s %s\n", $$1, $$2}'
	@echo ""
	@echo "Examples:"
	@echo " make install # Install dependencies"
	@echo " make dev # Run development server"
	@echo " make i18n-extract # Extract strings for translation"
	@echo " make i18n-init LANG_CODE=fr # Initialize French"
	@echo " make lint # Check code style"

## Installation and Setup
install: ## Install project dependencies
	@echo "[INFO] Installing dependencies..."
	$(PIP) install -r requirements.txt
	@echo "[SUCCESS] Dependencies installed"

setup-dev: ## Complete development setup
	@echo "[INFO] Setting up development environment..."
	$(PYTHON) -m venv $(VENV_DIR)
	./$(VENV_DIR)/bin/pip install -r requirements.txt
	@echo "[SUCCESS] Virtual environment created in $(VENV_DIR)"
	@echo "[INFO] Activate with: source $(VENV_DIR)/bin/activate"

requirements: ## Update and install requirements
	@echo "[INFO] Installing/updating requirements..."
	$(PIP) install --upgrade pip
	$(PIP) install -r requirements.txt
	@echo "[SUCCESS] Requirements installed"

## Development
dev: ## Run development server
	@echo "[INFO] Starting development server..."
	@echo "[INFO] Server available at: http://localhost:9010"
	$(PYTHON) server.py

run: dev ## Alias for dev

## Testing
test: ## Run tests
	@echo "[INFO] Running tests..."
	@if [ -d "tests" ]; then \
		$(PYTHON) -m pytest -v; \
	else \
		echo "[WARN] No tests directory found"; \
	fi

test-cov: ## Run tests with coverage
	@echo "[INFO] Running tests with coverage..."
	@if command -v pytest-cov >/dev/null 2>&1; then \
		$(PYTHON) -m pytest -v --cov=$(PROJECT_NAME) --cov-report=html; \
	else \
		echo "[WARN] pytest-cov not installed. Run: pip install pytest-cov"; \
	fi

## Internationalization (i18n)
i18n-extract: ## Extract strings for translation
	@echo "[INFO] Extracting strings for translation..."
	$(PYTHON) manage_translations.py extract
	@echo "[SUCCESS] Strings extracted to translations/messages.pot"

i18n-init: ## Initialize new language (use LANG_CODE=xx)
	@echo "[INFO] Initializing language: $(LANG_CODE)"
	$(PYTHON) manage_translations.py init $(LANG_CODE)
	@echo "[SUCCESS] Language $(LANG_CODE) initialized"
	@echo "[INFO] Edit: translations/$(LANG_CODE)/LC_MESSAGES/messages.po"

i18n-update: ## Update existing translations
	@echo "[INFO] Updating existing translations..."
	$(PYTHON) manage_translations.py update
	@echo "[SUCCESS] Translations updated"

i18n-compile: ## Compile translations to binary .mo files
	@echo "[INFO] Compiling translations..."
	$(PYTHON) manage_translations.py compile
	@echo "[SUCCESS] Translations compiled"

i18n-stats: ## Show translation statistics
	@echo "[INFO] Translation statistics:"
	@echo ""
	@for lang_dir in translations/*/; do \
		if [ -d "$$lang_dir" ] && [ "$$lang_dir" != "translations/*/" ]; then \
			lang=$$(basename "$$lang_dir"); \
			po_file="$$lang_dir/LC_MESSAGES/messages.po"; \
			if [ -f "$$po_file" ]; then \
				total=$$(grep -c "^msgid " "$$po_file" 2>/dev/null || echo "0"); \
				translated=$$(grep -c "^msgstr \"[^\"]\+\"" "$$po_file" 2>/dev/null || echo "0"); \
				fuzzy=$$(grep -c "^#, fuzzy" "$$po_file" 2>/dev/null || echo "0"); \
				if [ "$$total" -gt 0 ]; then \
					percent=$$((translated * 100 / total)); \
					echo " [STAT] $$lang: $$translated/$$total ($$percent%) - Fuzzy: $$fuzzy"; \
				else \
					echo " [STAT] $$lang: No translations yet"; \
				fi; \
			fi \
		fi \
	done
	@echo ""

i18n-clean: ## Clean compiled translation files
	@echo "[INFO] Cleaning compiled .mo files..."
	find translations/ -name "*.mo" -delete
	@echo "[SUCCESS] .mo files removed"

i18n-workflow: ## Complete workflow: extract → update → compile
	@echo "[INFO] Running complete translation workflow..."
	@make i18n-extract
	@make i18n-update
	@make i18n-compile
	@make i18n-stats
	@echo "[SUCCESS] Translation workflow completed"

## Code Quality
lint: ## Check code with flake8
	@echo "[INFO] Checking code style..."
	@if command -v flake8 >/dev/null 2>&1; then \
		flake8 youtube/ --max-line-length=120 --ignore=E501,W503,E402 --exclude=youtube/ytdlp_service.py,youtube/ytdlp_integration.py,youtube/ytdlp_proxy.py; \
		echo "[SUCCESS] Code style check passed"; \
	else \
		echo "[WARN] flake8 not installed (pip install flake8)"; \
	fi

format: ## Format code with black (if available)
	@echo "[INFO] Formatting code..."
	@if command -v black >/dev/null 2>&1; then \
		black youtube/ --line-length=120 --exclude='ytdlp_.*\.py'; \
		echo "[SUCCESS] Code formatted"; \
	else \
		echo "[WARN] black not installed (pip install black)"; \
	fi

check-deps: ## Check installed dependencies
	@echo "[INFO] Checking dependencies..."
	@$(PYTHON) -c "import flask_babel; print('[OK] Flask-Babel:', flask_babel.__version__)" 2>/dev/null || echo "[ERROR] Flask-Babel not installed"
	@$(PYTHON) -c "import flask; print('[OK] Flask:', flask.__version__)" 2>/dev/null || echo "[ERROR] Flask not installed"
	@$(PYTHON) -c "import yt_dlp; print('[OK] yt-dlp:', yt_dlp.__version__)" 2>/dev/null || echo "[ERROR] yt-dlp not installed"

## Maintenance
backup: ## Create translations backup
	@echo "[INFO] Creating translations backup..."
	@timestamp=$$(date +%Y%m%d_%H%M%S); \
	tar -czf "translations_backup_$$timestamp.tar.gz" translations/ 2>/dev/null || echo "[WARN] No translations to backup"; \
	if [ -f "translations_backup_$$timestamp.tar.gz" ]; then \
		echo "[SUCCESS] Backup created: translations_backup_$$timestamp.tar.gz"; \
	fi

restore: ## Restore translations from backup
	@echo "[INFO] Restoring translations from backup..."
	@if ls translations_backup_*.tar.gz 1>/dev/null 2>&1; then \
		latest_backup=$$(ls -t translations_backup_*.tar.gz | head -1); \
		tar -xzf "$$latest_backup"; \
		echo "[SUCCESS] Restored from: $$latest_backup"; \
	else \
		echo "[ERROR] No backup files found"; \
	fi

clean: ## Clean temporary files and caches
	@echo "[INFO] Cleaning temporary files..."
	find . -type f -name "*.pyc" -delete
	find . -type d -name "__pycache__" -delete
	find . -type f -name "*.mo" -delete
	find . -type d -name ".pytest_cache" -delete
	find . -type f -name ".coverage" -delete
	find . -type d -name "htmlcov" -delete
	@echo "[SUCCESS] Temporary files removed"

distclean: clean ## Clean everything including venv
	@echo "[INFO] Cleaning everything..."
	rm -rf $(VENV_DIR)
	@echo "[SUCCESS] Complete cleanup done"

## Project Information
info: ## Show project information
	@echo "[INFO] $(PROJECT_NAME) - Project information:"
	@echo ""
	@echo " [INFO] Directory: $$(pwd)"
	@echo " [INFO] Python: $$($(PYTHON) --version)"
	@echo " [INFO] Pip: $$($(PIP) --version | cut -d' ' -f1-2)"
	@echo ""
	@echo " [INFO] Configured languages:"
	@for lang_dir in translations/*/; do \
		if [ -d "$$lang_dir" ] && [ "$$lang_dir" != "translations/*/" ]; then \
			lang=$$(basename "$$lang_dir"); \
			echo " - $$lang"; \
		fi \
	done
	@echo ""
	@echo " [INFO] Main files:"
	@echo " - babel.cfg (i18n configuration)"
	@echo " - manage_translations.py (i18n CLI)"
	@echo " - youtube/i18n_strings.py (centralized strings)"
	@echo " - youtube/ytdlp_service.py (yt-dlp integration)"
	@echo ""

# Default target
.DEFAULT_GOAL := help

@@ -173,7 +173,6 @@ This project is completely free/Libre and will always be.
- [NewPipe](https://newpipe.schabi.org/) (app for android)
- [mps-youtube](https://github.com/mps-youtube/mps-youtube) (terminal-only program)
- [youtube-viewer](https://github.com/trizen/youtube-viewer)
- [FreeTube](https://github.com/FreeTubeApp/FreeTube) (Similar to this project, but is an electron app outside the browser)
- [smtube](https://www.smtube.org/)
- [Minitube](https://flavio.tordini.org/minitube), [github here](https://github.com/flaviotordini/minitube)
- [toogles](https://github.com/mikecrittenden/toogles) (only embeds videos, doesn't use mp4)

7
babel.cfg Normal file

@@ -0,0 +1,7 @@
[python: youtube/**.py]
keywords = lazy_gettext:1,2 _l:1,2
[python: server.py]
[python: settings.py]
[jinja2: youtube/templates/**.html]
extensions=jinja2.ext.i18n
encoding = utf-8

113
manage_translations.py Normal file

@@ -0,0 +1,113 @@
#!/usr/bin/env python3
"""
Translation management script for yt-local
Usage:
    python manage_translations.py extract # Extract strings to messages.pot
    python manage_translations.py init es # Initialize Spanish translation
    python manage_translations.py update # Update all translations
    python manage_translations.py compile # Compile translations to .mo files
"""
import sys
import os
import subprocess

# Ensure we use the Python from the virtual environment if available
if hasattr(sys, 'real_prefix') or (hasattr(sys, 'base_prefix') and sys.base_prefix != sys.prefix):
    # Already in venv
    pass
else:
    # Try to activate venv
    venv_path = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), 'venv')
    if os.path.exists(venv_path):
        venv_bin = os.path.join(venv_path, 'bin')
        if os.path.exists(venv_bin):
            os.environ['PATH'] = venv_bin + os.pathsep + os.environ['PATH']


def run_command(cmd):
    """Run a shell command and print output"""
    print(f"Running: {' '.join(cmd)}")
    # Use the pybabel from the same directory as our Python executable
    if cmd[0] == 'pybabel':
        import os
        pybabel_path = os.path.join(os.path.dirname(sys.executable), 'pybabel')
        if os.path.exists(pybabel_path):
            cmd = [pybabel_path] + cmd[1:]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stdout:
        print(result.stdout)
    if result.stderr:
        print(result.stderr, file=sys.stderr)
    return result.returncode


def extract():
    """Extract translatable strings from source code"""
    print("Extracting translatable strings...")
    return run_command([
        'pybabel', 'extract',
        '-F', 'babel.cfg',
        '-k', 'lazy_gettext',
        '-k', '_l',
        '-o', 'translations/messages.pot',
        '.'
    ])


def init(language):
    """Initialize a new language translation"""
    print(f"Initializing {language} translation...")
    return run_command([
        'pybabel', 'init',
        '-i', 'translations/messages.pot',
        '-d', 'translations',
        '-l', language
    ])


def update():
    """Update existing translations with new strings"""
    print("Updating translations...")
    return run_command([
        'pybabel', 'update',
        '-i', 'translations/messages.pot',
        '-d', 'translations'
    ])


def compile_translations():
    """Compile .po files to .mo files"""
    print("Compiling translations...")
    return run_command([
        'pybabel', 'compile',
        '-d', 'translations'
    ])


def main():
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)
    command = sys.argv[1]
    if command == 'extract':
        sys.exit(extract())
    elif command == 'init':
        if len(sys.argv) < 3:
            print("Error: Please specify a language code (e.g., es, fr, de)")
            sys.exit(1)
        sys.exit(init(sys.argv[2]))
    elif command == 'update':
        sys.exit(update())
    elif command == 'compile':
        sys.exit(compile_translations())
    else:
        print(f"Unknown command: {command}")
        print(__doc__)
        sys.exit(1)


if __name__ == '__main__':
    main()

@@ -1,4 +1,6 @@
Flask>=1.0.3
Flask-Babel>=4.0.0
Babel>=2.12.0
gevent>=1.2.2
Brotli>=1.0.7
PySocks>=1.6.8
@@ -6,3 +8,4 @@ urllib3>=1.24.1
defusedxml>=0.5.0
cachetools>=4.0.0
stem>=1.8.0
requests>=2.25.0

@@ -99,7 +99,6 @@ def proxy_site(env, start_response, video=False):
     if response.status >= 400:
         print('Error: YouTube returned "%d %s" while routing %s' % (
             response.status, response.reason, url.split('?')[0]))
     total_received = 0
     retry = False
     while True:
@@ -279,6 +278,16 @@ if __name__ == '__main__':
     print('Starting httpserver at http://%s:%s/' %
           (ip_server, settings.port_number))
+    # Show privacy-focused tips
+    print('')
+    print('Privacy & Rate Limiting Tips:')
+    print(' - Enable Tor routing in /settings for anonymity and better rate limits')
+    print(' - The system auto-retries with exponential backoff (max 5 retries)')
+    print(' - Wait a few minutes if you hit rate limits (429)')
+    print(' - For maximum privacy: Use Tor + No cookies')
+    print('')
     server.serve_forever()
 # for uwsgi, gunicorn, etc.

@@ -296,6 +296,17 @@ Archive: https://archive.ph/OZQbN''',
         'category': 'interface',
     }),
+    ('language', {
+        'type': str,
+        'default': 'en',
+        'comment': 'Interface language',
+        'options': [
+            ('en', 'English'),
+            ('es', 'Español'),
+        ],
+        'category': 'interface',
+    }),
     ('embed_page_mode', {
         'type': bool,
         'label': 'Enable embed page',
@@ -339,7 +350,8 @@ Archive: https://archive.ph/OZQbN''',
 program_directory = os.path.dirname(os.path.realpath(__file__))
 acceptable_targets = SETTINGS_INFO.keys() | {
-    'enable_comments', 'enable_related_videos', 'preferred_video_codec'
+    'enable_comments', 'enable_related_videos', 'preferred_video_codec',
+    'ytdlp_enabled',
 }
@@ -441,8 +453,7 @@ else:
     print("Running in non-portable mode")
     settings_dir = os.path.expanduser(os.path.normpath("~/.yt-local"))
     data_dir = os.path.expanduser(os.path.normpath("~/.yt-local/data"))
-    if not os.path.exists(settings_dir):
-        os.makedirs(settings_dir)
+    os.makedirs(settings_dir, exist_ok=True)
     settings_file_path = os.path.join(settings_dir, 'settings.txt')

213
tests/test_shorts.py Normal file

@@ -0,0 +1,213 @@
"""Tests for YouTube Shorts tab support.
Tests the protobuf token generation, shortsLockupViewModel parsing,
and view count formatting — all without network access.
"""
import sys
import os
import base64
import pytest

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
import youtube.proto as proto
from youtube.yt_data_extract.common import (
    extract_item_info, extract_items, extract_shorts_lockup_view_model_info,
    extract_approx_int,
)


# --- channel_ctoken_v5 token generation ---

class TestChannelCtokenV5:
    """Test that continuation tokens are generated with correct protobuf structure."""

    @pytest.fixture(autouse=True)
    def setup(self):
        from youtube.channel import channel_ctoken_v5
        self.channel_ctoken_v5 = channel_ctoken_v5

    def _decode_outer(self, ctoken):
        """Decode the outer protobuf layer of a ctoken."""
        raw = base64.urlsafe_b64decode(ctoken + '==')
        return {fn: val for _, fn, val in proto.read_protobuf(raw)}

    def test_shorts_token_generates_without_error(self):
        token = self.channel_ctoken_v5('UCrBzBOMcUVV8ryyAU_c6P5g', '1', '3', 'shorts')
        assert token is not None
        assert len(token) > 50

    def test_videos_token_generates_without_error(self):
        token = self.channel_ctoken_v5('UCrBzBOMcUVV8ryyAU_c6P5g', '1', '3', 'videos')
        assert token is not None

    def test_streams_token_generates_without_error(self):
        token = self.channel_ctoken_v5('UCrBzBOMcUVV8ryyAU_c6P5g', '1', '3', 'streams')
        assert token is not None

    def test_outer_structure_has_channel_id(self):
        token = self.channel_ctoken_v5('UCrBzBOMcUVV8ryyAU_c6P5g', '1', '3', 'shorts')
        fields = self._decode_outer(token)
        # Field 80226972 is the main wrapper
        assert 80226972 in fields

    def test_different_tabs_produce_different_tokens(self):
        t_videos = self.channel_ctoken_v5('UCtest', '1', '3', 'videos')
        t_shorts = self.channel_ctoken_v5('UCtest', '1', '3', 'shorts')
        t_streams = self.channel_ctoken_v5('UCtest', '1', '3', 'streams')
        assert t_videos != t_shorts
        assert t_shorts != t_streams
        assert t_videos != t_streams


# --- shortsLockupViewModel parsing ---
SAMPLE_SHORT = {
    'shortsLockupViewModel': {
        'entityId': 'shorts-shelf-item-auWWV955Q38',
        'accessibilityText': 'Globant Converge - DECEMBER 10 and 11, 7.1 thousand views - play Short',
        'onTap': {
            'innertubeCommand': {
                'reelWatchEndpoint': {
                    'videoId': 'auWWV955Q38',
                    'thumbnail': {
                        'thumbnails': [
                            {'url': 'https://i.ytimg.com/vi/auWWV955Q38/frame0.jpg',
                             'width': 1080, 'height': 1920}
                        ]
                    }
                }
            }
        }
    }
}

SAMPLE_SHORT_MILLION = {
    'shortsLockupViewModel': {
        'entityId': 'shorts-shelf-item-xyz123',
        'accessibilityText': 'Cool Video Title, 1.2 million views - play Short',
        'onTap': {
            'innertubeCommand': {
                'reelWatchEndpoint': {
                    'videoId': 'xyz123',
                    'thumbnail': {'thumbnails': [{'url': 'https://example.com/thumb.jpg'}]}
                }
            }
        }
    }
}

SAMPLE_SHORT_NO_SUFFIX = {
    'shortsLockupViewModel': {
        'entityId': 'shorts-shelf-item-abc456',
        'accessibilityText': 'Simple Short, 25 views - play Short',
        'onTap': {
            'innertubeCommand': {
                'reelWatchEndpoint': {
                    'videoId': 'abc456',
                    'thumbnail': {'thumbnails': [{'url': 'https://example.com/thumb2.jpg'}]}
                }
            }
        }
    }
}


class TestShortsLockupViewModel:
"""Test extraction of video info from shortsLockupViewModel."""
def test_extracts_video_id(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['id'] == 'auWWV955Q38'
def test_extracts_title(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['title'] == 'Globant Converge - DECEMBER 10 and 11'
def test_extracts_thumbnail(self):
info = extract_item_info(SAMPLE_SHORT)
assert 'ytimg.com' in info['thumbnail']
def test_type_is_video(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['type'] == 'video'
def test_no_error(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['error'] is None
def test_duration_is_empty_not_none(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['duration'] == ''
def test_fallback_id_from_entity_id(self):
item = {'shortsLockupViewModel': {
'entityId': 'shorts-shelf-item-fallbackID',
'accessibilityText': 'Title, 10 views - play Short',
'onTap': {'innertubeCommand': {}}
}}
info = extract_item_info(item)
assert info['id'] == 'fallbackID'
class TestShortsViewCount:
"""Test view count formatting with K/M/B suffixes."""
def test_thousand_views(self):
info = extract_item_info(SAMPLE_SHORT)
assert info['approx_view_count'] == '7.1 K'
def test_million_views(self):
info = extract_item_info(SAMPLE_SHORT_MILLION)
assert info['approx_view_count'] == '1.2 M'
def test_plain_number_views(self):
info = extract_item_info(SAMPLE_SHORT_NO_SUFFIX)
assert info['approx_view_count'] == '25'
def test_billion_views(self):
item = {'shortsLockupViewModel': {
'entityId': 'shorts-shelf-item-big1',
'accessibilityText': 'Viral, 3 billion views - play Short',
'onTap': {'innertubeCommand': {
'reelWatchEndpoint': {'videoId': 'big1',
'thumbnail': {'thumbnails': [{'url': 'https://x.com/t.jpg'}]}}
}}
}}
info = extract_item_info(item)
assert info['approx_view_count'] == '3 B'
def test_additional_info_applied(self):
additional = {'author': 'Pelado Nerd', 'author_id': 'UC123'}
info = extract_item_info(SAMPLE_SHORT, additional)
assert info['author'] == 'Pelado Nerd'
assert info['author_id'] == 'UC123'
# --- extract_items with shorts API response structure ---
class TestExtractItemsShorts:
"""Test that extract_items handles the reloadContinuationItemsCommand format."""
def _make_response(self, items):
return {
'onResponseReceivedActions': [
{'reloadContinuationItemsCommand': {
'continuationItems': [{'chipBarViewModel': {}}]
}},
{'reloadContinuationItemsCommand': {
'continuationItems': [
{'richItemRenderer': {'content': item}}
for item in items
]
}}
]
}
def test_extracts_shorts_from_response(self):
response = self._make_response([
SAMPLE_SHORT['shortsLockupViewModel'],
])
# richItemRenderer dispatches to content, but shortsLockupViewModel
# needs to be wrapped properly
items, ctoken = extract_items(response)
assert len(items) >= 0 # smoke test only: this is always true, so it just checks that extract_items does not raise
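The view-count tests above expect the title and an abbreviated count to be split out of `accessibilityText`. A minimal sketch of one way to do that follows. It assumes the text always has the shape `<title>, <number> [thousand|million|billion] views - play Short`, as in the fixtures. `split_accessibility_text` is a hypothetical helper, not the project's `extract_item_info`, whose implementation is not part of this diff.

```python
import re

# Hypothetical helper for illustration; the real parsing lives in
# yt_data_extract and is not shown in this diff.
_SUFFIXES = {'thousand': 'K', 'million': 'M', 'billion': 'B'}
_ACCESSIBILITY_RE = re.compile(
    r'^(?P<title>.*?),\s*(?P<number>[\d.,]+)\s*'
    r'(?P<word>thousand|million|billion)?\s*views?\s*-\s*play Short$',
    re.IGNORECASE,
)


def split_accessibility_text(text):
    """Return (title, approx_view_count), or (text, None) if unparseable."""
    match = _ACCESSIBILITY_RE.match(text)
    if match is None:
        return text, None
    count = match.group('number')
    word = match.group('word')
    if word:
        # Spaced suffix, matching the "7.1 K" style the tests assert on.
        count += ' ' + _SUFFIXES[word.lower()]
    return match.group('title'), count
```

The title group is non-greedy so that a count containing a thousands separator (for example `1,234`) stays in the number group rather than leaking into the title.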

@@ -0,0 +1,74 @@
# Spanish translations for yt-local.
# Copyright (C) 2026 yt-local
# This file is distributed under the same license as the yt-local project.
#
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2026-03-22 15:05-0500\n"
"PO-Revision-Date: 2026-03-22 15:06-0500\n"
"Last-Translator: \n"
"Language: es\n"
"Language-Team: es <LL@li.org>\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.18.0\n"
#: youtube/templates/base.html:38
msgid "Type to search..."
msgstr "Escribe para buscar..."
#: youtube/templates/base.html:39
msgid "Search"
msgstr "Buscar"
#: youtube/templates/base.html:45
msgid "Options"
msgstr "Opciones"
#: youtube/templates/base.html:47
msgid "Sort by"
msgstr "Ordenar por"
#: youtube/templates/base.html:50
msgid "Relevance"
msgstr "Relevancia"
#: youtube/templates/base.html:54 youtube/templates/base.html:65
msgid "Upload date"
msgstr "Fecha de subida"
#: youtube/templates/base.html:58
msgid "View count"
msgstr "Número de visualizaciones"
#: youtube/templates/base.html:62
msgid "Rating"
msgstr "Calificación"
#: youtube/templates/base.html:68
msgid "Any"
msgstr "Cualquiera"
#: youtube/templates/base.html:72
msgid "Last hour"
msgstr "Última hora"
#: youtube/templates/base.html:76
msgid "Today"
msgstr "Hoy"
#: youtube/templates/base.html:80
msgid "This week"
msgstr "Esta semana"
#: youtube/templates/base.html:84
msgid "This month"
msgstr "Este mes"
#: youtube/templates/base.html:88
msgid "This year"
msgstr "Este año"

translations/messages.pot Normal file

@@ -0,0 +1,75 @@
# Translations template for PROJECT.
# Copyright (C) 2026 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2026.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2026-03-22 15:05-0500\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.18.0\n"
#: youtube/templates/base.html:38
msgid "Type to search..."
msgstr ""
#: youtube/templates/base.html:39
msgid "Search"
msgstr ""
#: youtube/templates/base.html:45
msgid "Options"
msgstr ""
#: youtube/templates/base.html:47
msgid "Sort by"
msgstr ""
#: youtube/templates/base.html:50
msgid "Relevance"
msgstr ""
#: youtube/templates/base.html:54 youtube/templates/base.html:65
msgid "Upload date"
msgstr ""
#: youtube/templates/base.html:58
msgid "View count"
msgstr ""
#: youtube/templates/base.html:62
msgid "Rating"
msgstr ""
#: youtube/templates/base.html:68
msgid "Any"
msgstr ""
#: youtube/templates/base.html:72
msgid "Last hour"
msgstr ""
#: youtube/templates/base.html:76
msgid "Today"
msgstr ""
#: youtube/templates/base.html:80
msgid "This week"
msgstr ""
#: youtube/templates/base.html:84
msgid "This month"
msgstr ""
#: youtube/templates/base.html:88
msgid "This year"
msgstr ""

@@ -5,14 +5,48 @@ from flask import request
import jinja2
import settings
import traceback
import logging
import re
from sys import exc_info
from flask_babel import Babel
yt_app = flask.Flask(__name__)
yt_app.config['TEMPLATES_AUTO_RELOAD'] = True
yt_app.url_map.strict_slashes = False
# Don't log full tracebacks for handled FetchErrors
class FetchErrorFilter(logging.Filter):
def filter(self, record):
if record.exc_info and record.exc_info[0] == util.FetchError:
return False
return True
yt_app.logger.addFilter(FetchErrorFilter())
# yt_app.jinja_env.trim_blocks = True
# yt_app.jinja_env.lstrip_blocks = True
# Configure Babel for i18n
import os
yt_app.config['BABEL_DEFAULT_LOCALE'] = 'en'
# Use absolute path for translations directory to avoid issues with package structure changes
_app_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
yt_app.config['BABEL_TRANSLATION_DIRECTORIES'] = os.path.join(_app_root, 'translations')
def get_locale():
"""Determine the best locale based on user preference or browser settings"""
# Check if user has a language preference in settings
if hasattr(settings, 'language') and settings.language:
locale = settings.language
print(f'[i18n] Using user preference: {locale}')
return locale
# Otherwise, use browser's Accept-Language header
# Only match languages with available translations
locale = request.accept_languages.best_match(['en', 'es'])
print(f'[i18n] Using browser language: {locale}')
return locale or 'en'
babel = Babel(yt_app, locale_selector=get_locale)
yt_app.add_url_rule('/settings', 'settings_page', settings.settings_page, methods=['POST', 'GET'])
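The selection order in `get_locale` (explicit setting first, then the browser's `Accept-Language`, then `'en'`) can be sketched without Flask. This is a simplified stand-in, not Werkzeug's `best_match` algorithm. `pick_locale` and its parameters are hypothetical names, and matching on the primary subtag only is an assumption made for the example.

```python
def pick_locale(user_preference, accept_language,
                supported=('en', 'es'), default='en'):
    """Prefer an explicit setting, else the best supported header entry."""
    if user_preference:
        return user_preference
    candidates = []
    for position, part in enumerate(accept_language.split(',')):
        tag, _, params = part.strip().partition(';')
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith('q='):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Assumption: compare only the primary subtag, so "es-MX" matches "es".
        primary = tag.split('-')[0].lower()
        if primary in supported and quality > 0:
            # Higher quality wins; earlier header position breaks ties.
            candidates.append((-quality, position, primary))
    if not candidates:
        return default
    return min(candidates)[2]
```

As in the diff, a configured preference is returned unchanged even if no translation exists for it; falling back to the default in that case is left to the translation layer.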
@@ -100,36 +134,54 @@ def timestamps(text):
@yt_app.errorhandler(500)
def error_page(e):
slim = request.args.get('slim', False) # whether it was an ajax request
if (exc_info()[0] == util.FetchError
and exc_info()[1].code == '429'
and settings.route_tor
):
error_message = ('Error: YouTube blocked the request because the Tor'
' exit node is overutilized. Try getting a new exit node by'
' using the New Identity button in the Tor Browser.')
if exc_info()[1].error_message:
error_message += '\n\n' + exc_info()[1].error_message
if exc_info()[1].ip:
error_message += '\n\nExit node IP address: ' + exc_info()[1].ip
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
elif exc_info()[0] == util.FetchError and exc_info()[1].error_message:
return (flask.render_template(
'error.html',
error_message=exc_info()[1].error_message,
slim=slim
), 502)
elif (exc_info()[0] == util.FetchError
and exc_info()[1].code == '404'
):
error_message = ('Error: The page you are looking for isn\'t here.')
return flask.render_template('error.html',
error_code=exc_info()[1].code,
error_message=error_message,
slim=slim), 404
if exc_info()[0] == util.FetchError:
fetch_err = exc_info()[1]
error_code = fetch_err.code
if error_code == '429' and settings.route_tor:
error_message = ('Error: YouTube blocked the request because the Tor'
' exit node is overutilized. Try getting a new exit node by'
' using the New Identity button in the Tor Browser.')
if fetch_err.error_message:
error_message += '\n\n' + fetch_err.error_message
if fetch_err.ip:
error_message += '\n\nExit node IP address: ' + fetch_err.ip
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
elif error_code == '429':
error_message = ('YouTube is temporarily blocking requests from your IP address (429 Too Many Requests).\n\n'
'Try:\n'
'• Wait a few minutes and refresh\n'
'• Enable Tor routing in Settings for automatic IP rotation\n'
'• Use a VPN to change your IP address')
if fetch_err.ip:
error_message += '\n\nYour IP: ' + fetch_err.ip
return flask.render_template('error.html', error_message=error_message, slim=slim), 429
elif error_code == '502' and ('Failed to resolve' in str(fetch_err) or 'Failed to establish' in str(fetch_err)):
error_message = ('Could not connect to YouTube.\n\n'
'Check your internet connection and try again.')
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
elif error_code == '403':
error_message = ('YouTube blocked this request (403 Forbidden).\n\n'
'Try enabling Tor routing in Settings.')
return flask.render_template('error.html', error_message=error_message, slim=slim), 403
elif error_code == '404':
error_message = 'Error: The page you are looking for isn\'t here.'
return flask.render_template('error.html', error_code=error_code,
error_message=error_message, slim=slim), 404
else:
# Catch-all for any other FetchError (400, etc.)
error_message = f'Error communicating with YouTube ({error_code}).'
if fetch_err.error_message:
error_message += '\n\n' + fetch_err.error_message
return flask.render_template('error.html', error_message=error_message, slim=slim), 502
return flask.render_template('error.html', traceback=traceback.format_exc(),
error_code=exc_info()[1].code,
slim=slim), 500
# return flask.render_template('error.html', traceback=traceback.format_exc(), slim=slim), 500
font_choices = {
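The rewritten `error_page` maps each `FetchError` code to the HTTP status returned to the browser. The mapping read off the branches above can be summarized as a small function. `status_for_fetch_error` is a hypothetical name used only for this sketch; the real handler also builds the message text and renders `error.html`.

```python
def status_for_fetch_error(code, route_tor=False):
    """Return the HTTP status the handler responds with for a FetchError code.

    `code` is a string, as in the diff (e.g. '429').
    """
    if code == '429':
        # With Tor routing the block is reported as a bad gateway (502);
        # without it the 429 is passed through to the client.
        return 502 if route_tor else 429
    if code in ('403', '404'):
        return int(code)
    # Connection failures and every other FetchError fall back to 502.
    return 502
```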

@@ -33,53 +33,52 @@ headers_mobile = (
real_cookie = (('Cookie', 'VISITOR_INFO1_LIVE=8XihrAcN1l4'),)
generic_cookie = (('Cookie', 'VISITOR_INFO1_LIVE=ST1Ti53r4fU'),)
# added an extra nesting under the 2nd base64 compared to v4
# added tab support
# changed offset field to uint id 1
# FIXED 2026: YouTube changed continuation token structure (from Invidious commit a9f8127)
# Sort values for YouTube API (from Invidious): 2=popular, 4=newest, 5=oldest
def channel_ctoken_v5(channel_id, page, sort, tab, view=1):
new_sort = (2 if int(sort) == 1 else 1)
offset = 30*(int(page) - 1)
if tab == 'videos':
tab = 15
elif tab == 'shorts':
tab = 10
elif tab == 'streams':
tab = 14
# Tab-specific protobuf field numbers (from Invidious source)
# Each tab uses different field numbers in the protobuf structure:
# videos: 110 -> 3 -> 15 -> { 2:{1:UUID}, 4:sort, 8:{1:UUID, 3:sort} }
# shorts: 110 -> 3 -> 10 -> { 2:{1:UUID}, 4:sort, 7:{1:UUID, 3:sort} }
# streams: 110 -> 3 -> 14 -> { 2:{1:UUID}, 5:sort, 8:{1:UUID, 3:sort} }
tab_config = {
'videos': {'tab_field': 15, 'sort_field': 4, 'embedded_field': 8},
'shorts': {'tab_field': 10, 'sort_field': 4, 'embedded_field': 7},
'streams': {'tab_field': 14, 'sort_field': 5, 'embedded_field': 8},
}
config = tab_config.get(tab, tab_config['videos'])
tab_field = config['tab_field']
sort_field = config['sort_field']
embedded_field = config['embedded_field']
# Map sort values to YouTube API values
if tab == 'streams':
sort_mapping = {'1': 14, '2': 13, '3': 12, '4': 12}
else:
sort_mapping = {'1': 2, '2': 5, '3': 4, '4': 4}
new_sort = sort_mapping.get(sort, sort_mapping['3'])
# UUID placeholder (field 1)
uuid_str = "00000000-0000-0000-0000-000000000000"
# Build the tab-level object matching Invidious structure exactly:
# { 2: embedded{1: UUID}, sort_field: sort_val, embedded_field: embedded{1: UUID, 3: sort_val} }
tab_content = (
proto.string(2, proto.string(1, uuid_str))
+ proto.uint(sort_field, new_sort)
+ proto.string(embedded_field,
proto.string(1, uuid_str) + proto.uint(3, new_sort))
)
tab_wrapper = proto.string(tab_field, tab_content)
inner_container = proto.string(3, tab_wrapper)
outer_container = proto.string(110, inner_container)
encoded_inner = proto.percent_b64encode(outer_container)
pointless_nest = proto.string(80226972,
proto.string(2, channel_id)
+ proto.string(3,
proto.percent_b64encode(
proto.string(110,
proto.string(3,
proto.string(tab,
proto.string(1,
proto.string(1,
proto.unpadded_b64encode(
proto.string(1,
proto.string(1,
proto.unpadded_b64encode(
proto.string(2,
b"ST:"
+ proto.unpadded_b64encode(
proto.uint(1, offset)
)
)
)
)
)
)
)
# targetId, just needs to be present but
# doesn't need to be correct
+ proto.string(2, "63faaff0-0000-23fe-80f0-582429d11c38")
)
# 1 - newest, 2 - popular
+ proto.uint(3, new_sort)
)
)
)
)
)
+ proto.string(3, encoded_inner)
)
return base64.urlsafe_b64encode(pointless_nest).decode('ascii')
@@ -161,11 +160,6 @@ def channel_ctoken_v4(channel_id, page, sort, tab, view=1):
# SORT:
# videos:
# Popular - 1
# Oldest - 2
# Newest - 3
# playlists:
# Oldest - 2
# Newest - 3
# Last video added - 4
@@ -329,11 +323,10 @@ def get_channel_id(base_url):
metadata_cache = cachetools.LRUCache(128)
@cachetools.cached(metadata_cache)
def get_metadata(channel_id):
base_url = 'https://www.youtube.com/channel/' + channel_id
polymer_json = util.fetch_url(base_url + '/about?pbj=1',
headers_desktop,
debug_name='gen_channel_about',
report_text='Retrieved channel metadata')
# Use youtubei browse API to get channel metadata
polymer_json = util.call_youtube_api('web', 'browse', {
'browseId': channel_id,
})
info = yt_data_extract.extract_channel_info(json.loads(polymer_json),
'about',
continuation=False)
@@ -389,7 +382,12 @@ def post_process_channel_info(info):
info['avatar'] = util.prefix_url(info['avatar'])
info['channel_url'] = util.prefix_url(info['channel_url'])
for item in info['items']:
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['id'])
# Only set thumbnail if YouTube didn't provide one
if not item.get('thumbnail'):
if item.get('type') == 'playlist' and item.get('first_video_id'):
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['first_video_id'])
elif item.get('type') == 'video' and item.get('id'):
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['id'])
util.prefix_urls(item)
util.add_extra_html_info(item)
if info['current_tab'] == 'about':
@@ -398,11 +396,20 @@ def post_process_channel_info(info):
info['links'][i] = (text, util.prefix_url(url))
def get_channel_first_page(base_url=None, tab='videos', channel_id=None):
def get_channel_first_page(base_url=None, tab='videos', channel_id=None, sort=None):
if channel_id:
base_url = 'https://www.youtube.com/channel/' + channel_id
return util.fetch_url(base_url + '/' + tab + '?pbj=1&view=0',
headers_desktop, debug_name='gen_channel_' + tab)
# Build URL with sort parameter
# YouTube URL sort params: p=popular, dd=newest, lad=newest no shorts
# Note: 'da' (oldest) was removed by YouTube in January 2026
url = base_url + '/' + tab + '?pbj=1&view=0'
if sort:
# Map sort values to YouTube's URL parameter values
sort_map = {'3': 'dd', '4': 'lad'}
url += '&sort=' + sort_map.get(sort, 'dd')
return util.fetch_url(url, headers_desktop, debug_name='gen_channel_' + tab)
playlist_sort_codes = {'2': "da", '3': "dd", '4': "lad"}
@@ -416,7 +423,6 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
page_number = int(request.args.get('page', 1))
# sort 1: views
# sort 2: oldest
# sort 3: newest
# sort 4: newest - no shorts (Just a kludge on our end, not internal to yt)
default_sort = '3' if settings.include_shorts_in_channel else '4'
sort = request.args.get('sort', default_sort)
@@ -478,30 +484,35 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
# Use the regular channel API
if tab in ('shorts', 'streams') or (tab=='videos' and try_channel_api):
if channel_id:
num_videos_call = (get_number_of_videos_channel, channel_id)
else:
num_videos_call = (get_number_of_videos_general, base_url)
if not channel_id:
channel_id = get_channel_id(base_url)
# Use ctoken method, which YouTube changes all the time
if channel_id and not default_params:
if sort == 4:
_sort = 3
# Use youtubei browse API with continuation token for all pages
page_call = (get_channel_tab, channel_id, str(page_number), sort,
tab, int(view))
continuation = True
if tab == 'videos':
# Only need video count for the videos tab
if channel_id:
num_videos_call = (get_number_of_videos_channel, channel_id)
else:
_sort = sort
page_call = (get_channel_tab, channel_id, page_number, _sort,
tab, view, ctoken)
# Use the first-page method, which won't break
num_videos_call = (get_number_of_videos_general, base_url)
tasks = (
gevent.spawn(*num_videos_call),
gevent.spawn(*page_call),
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
number_of_videos, polymer_json = tasks[0].value, tasks[1].value
else:
page_call = (get_channel_first_page, base_url, tab)
tasks = (
gevent.spawn(*num_videos_call),
gevent.spawn(*page_call),
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
number_of_videos, polymer_json = tasks[0].value, tasks[1].value
# For shorts/streams, item count is used instead
polymer_json = gevent.spawn(*page_call)
polymer_json.join()
if polymer_json.exception:
raise polymer_json.exception
polymer_json = polymer_json.value
number_of_videos = 0 # will be replaced by actual item count later
elif tab == 'about':
# polymer_json = util.fetch_url(base_url + '/about?pbj=1', headers_desktop, debug_name='gen_channel_about')
@@ -512,7 +523,14 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
})
continuation=True
elif tab == 'playlists' and page_number == 1:
polymer_json = util.fetch_url(base_url+ '/playlists?pbj=1&view=1&sort=' + playlist_sort_codes[sort], headers_desktop, debug_name='gen_channel_playlists')
# Use youtubei API instead of deprecated pbj=1 format
if not channel_id:
channel_id = get_channel_id(base_url)
ctoken = channel_ctoken_v3(channel_id, page='1', sort=sort, tab='playlists', view=view)
polymer_json = util.call_youtube_api('web', 'browse', {
'continuation': ctoken,
})
continuation = True
elif tab == 'playlists':
polymer_json = get_channel_tab(channel_id, page_number, sort,
'playlists', view)
@@ -542,7 +560,8 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
channel_id = info['channel_id']
# Will have microformat present, cache metadata while we have it
if channel_id and default_params and tab not in ('videos', 'about'):
if (channel_id and default_params and tab not in ('videos', 'about')
and info.get('channel_name') is not None):
metadata = extract_metadata_for_caching(info)
set_cached_metadata(channel_id, metadata)
# Otherwise, populate with our (hopefully cached) metadata
@@ -560,8 +579,12 @@ def get_channel_page_general_url(base_url, tab, request, channel_id=None):
item.update(additional_info)
if tab in ('videos', 'shorts', 'streams'):
if tab in ('shorts', 'streams'):
# For shorts/streams, use the actual item count since
# get_number_of_videos_channel counts regular uploads only
number_of_videos = len(info.get('items', []))
info['number_of_videos'] = number_of_videos
info['number_of_pages'] = math.ceil(number_of_videos/page_size)
info['number_of_pages'] = math.ceil(number_of_videos/page_size) if number_of_videos else 1
info['header_playlist_names'] = local_playlist.get_playlist_names()
if tab in ('videos', 'shorts', 'streams', 'playlists'):
info['current_sort'] = sort
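The guarded page count at the end of this hunk can be checked in isolation. In the sketch below, `number_of_pages` is a hypothetical helper and the default `page_size` of 30 is an assumption based on the 30-item offset used by the old token code; the real value is defined elsewhere in the module.

```python
import math


def number_of_pages(number_of_videos, page_size=30):
    """Return at least one page, even when no items were counted."""
    if not number_of_videos:
        return 1
    return math.ceil(number_of_videos / page_size)
```

For shorts and streams the count passed in is the number of items on the current page, so the result only reflects what has been fetched so far.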

@@ -53,7 +53,7 @@ def request_comments(ctoken, replies=False):
'hl': 'en',
'gl': 'US',
'clientName': 'MWEB',
'clientVersion': '2.20240328.08.00',
'clientVersion': '2.20210804.02.00',
},
},
'continuation': ctoken.replace('=', '%3D'),
@@ -78,7 +78,7 @@ def single_comment_ctoken(video_id, comment_id):
def post_process_comments_info(comments_info):
for comment in comments_info['comments']:
comment['author'] = strip_non_ascii(comment['author'])
comment['author'] = strip_non_ascii(comment['author']) if comment.get('author') else ""
comment['author_url'] = concat_or_none(
'/', comment['author_url'])
comment['author_avatar'] = concat_or_none(
@@ -189,10 +189,10 @@ def video_comments(video_id, sort=0, offset=0, lc='', secret_key=''):
comments_info['error'] += '\n\n' + e.error_message
comments_info['error'] += '\n\nExit node IP address: %s' % e.ip
else:
comments_info['error'] = 'YouTube blocked the request. IP address: %s' % e.ip
comments_info['error'] = 'YouTube blocked the request. Error: %s' % str(e)
except Exception as e:
comments_info['error'] = 'YouTube blocked the request. IP address: %s' % e.ip
comments_info['error'] = 'YouTube blocked the request. Error: %s' % str(e)
if comments_info.get('error'):
print('Error retrieving comments for ' + str(video_id) + ':\n' +

youtube/i18n_strings.py Normal file

@@ -0,0 +1,112 @@
#!/usr/bin/env python3
"""
Centralized i18n strings for yt-local
This file contains static strings that need to be translated but are used
dynamically in templates or generated content. By importing this module,
these strings get extracted by babel for translation.
"""
from flask_babel import lazy_gettext as _l
# Settings categories
CATEGORY_NETWORK = _l('Network')
CATEGORY_PLAYBACK = _l('Playback')
CATEGORY_INTERFACE = _l('Interface')
# Common setting labels
ROUTE_TOR = _l('Route Tor')
DEFAULT_SUBTITLES_MODE = _l('Default subtitles mode')
AV1_CODEC_RANKING = _l('AV1 Codec Ranking')
VP8_VP9_CODEC_RANKING = _l('VP8/VP9 Codec Ranking')
H264_CODEC_RANKING = _l('H.264 Codec Ranking')
USE_INTEGRATED_SOURCES = _l('Use integrated sources')
ROUTE_IMAGES = _l('Route images')
ENABLE_COMMENTS_JS = _l('Enable comments.js')
ENABLE_SPONSORBLOCK = _l('Enable SponsorBlock')
ENABLE_EMBED_PAGE = _l('Enable embed page')
# Setting names (auto-generated from setting keys)
RELATED_VIDEOS_MODE = _l('Related videos mode')
COMMENTS_MODE = _l('Comments mode')
ENABLE_COMMENT_AVATARS = _l('Enable comment avatars')
DEFAULT_COMMENT_SORTING = _l('Default comment sorting')
THEATER_MODE = _l('Theater mode')
AUTOPLAY_VIDEOS = _l('Autoplay videos')
DEFAULT_RESOLUTION = _l('Default resolution')
USE_VIDEO_PLAYER = _l('Use video player')
USE_VIDEO_DOWNLOAD = _l('Use video download')
PROXY_IMAGES = _l('Proxy images')
THEME = _l('Theme')
FONT = _l('Font')
LANGUAGE = _l('Language')
EMBED_PAGE_MODE = _l('Embed page mode')
# Common option values
OFF = _l('Off')
ON = _l('On')
DISABLED = _l('Disabled')
ENABLED = _l('Enabled')
ALWAYS_SHOWN = _l('Always shown')
SHOWN_BY_CLICKING_BUTTON = _l('Shown by clicking button')
NATIVE = _l('Native')
NATIVE_WITH_HOTKEYS = _l('Native with hotkeys')
PLYR = _l('Plyr')
# Theme options
LIGHT = _l('Light')
GRAY = _l('Gray')
DARK = _l('Dark')
# Font options
BROWSER_DEFAULT = _l('Browser default')
LIBERATION_SERIF = _l('Liberation Serif')
ARIAL = _l('Arial')
VERDANA = _l('Verdana')
TAHOMA = _l('Tahoma')
# Search and filter options
SORT_BY = _l('Sort by')
RELEVANCE = _l('Relevance')
UPLOAD_DATE = _l('Upload date')
VIEW_COUNT = _l('View count')
RATING = _l('Rating')
# Time filters
ANY = _l('Any')
LAST_HOUR = _l('Last hour')
TODAY = _l('Today')
THIS_WEEK = _l('This week')
THIS_MONTH = _l('This month')
THIS_YEAR = _l('This year')
# Content types
TYPE = _l('Type')
VIDEO = _l('Video')
CHANNEL = _l('Channel')
PLAYLIST = _l('Playlist')
MOVIE = _l('Movie')
SHOW = _l('Show')
# Duration filters
DURATION = _l('Duration')
SHORT_DURATION = _l('Short (< 4 minutes)')
LONG_DURATION = _l('Long (> 20 minutes)')
# Actions
SEARCH = _l('Search')
DOWNLOAD = _l('Download')
SUBSCRIBE = _l('Subscribe')
UNSUBSCRIBE = _l('Unsubscribe')
IMPORT = _l('Import')
EXPORT = _l('Export')
SAVE = _l('Save')
CHECK = _l('Check')
MUTE = _l('Mute')
UNMUTE = _l('Unmute')
# Common UI elements
OPTIONS = _l('Options')
SETTINGS = _l('Settings')
ERROR = _l('Error')
LOADING = _l('loading...')

@@ -26,8 +26,7 @@ def video_ids_in_playlist(name):
def add_to_playlist(name, video_info_list):
if not os.path.exists(playlists_directory):
os.makedirs(playlists_directory)
os.makedirs(playlists_directory, exist_ok=True)
ids = video_ids_in_playlist(name)
missing_thumbnails = []
with open(os.path.join(playlists_directory, name + ".txt"), "a", encoding='utf-8') as file:

@@ -8,7 +8,7 @@ import json
import string
import gevent
import math
from flask import request
from flask import request, abort
import flask
@@ -30,42 +30,58 @@ def playlist_ctoken(playlist_id, offset, include_shorts=True):
def playlist_first_page(playlist_id, report_text="Retrieved playlist",
use_mobile=False):
if use_mobile:
url = 'https://m.youtube.com/playlist?list=' + playlist_id + '&pbj=1'
content = util.fetch_url(
url, util.mobile_xhr_headers,
report_text=report_text, debug_name='playlist_first_page'
)
content = json.loads(content.decode('utf-8'))
else:
url = 'https://www.youtube.com/playlist?list=' + playlist_id + '&pbj=1'
content = util.fetch_url(
url, util.desktop_xhr_headers,
report_text=report_text, debug_name='playlist_first_page'
)
content = json.loads(content.decode('utf-8'))
# Use innertube API (pbj=1 no longer works for many playlists)
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
return content
data = {
'context': {
'client': {
'hl': 'en',
'gl': 'US',
'clientName': 'WEB',
'clientVersion': '2.20240327.00.00',
},
},
'browseId': 'VL' + playlist_id,
}
content_type_header = (('Content-Type', 'application/json'),)
content = util.fetch_url(
url, util.desktop_xhr_headers + content_type_header,
data=json.dumps(data),
report_text=report_text, debug_name='playlist_first_page'
)
return json.loads(content.decode('utf-8'))
def get_videos(playlist_id, page, include_shorts=True, use_mobile=False,
report_text='Retrieved playlist'):
# mobile requests return 20 videos per page
if use_mobile:
page_size = 20
headers = util.mobile_xhr_headers
# desktop requests return 100 videos per page
else:
page_size = 100
headers = util.desktop_xhr_headers
page_size = 100
url = "https://m.youtube.com/playlist?ctoken="
url += playlist_ctoken(playlist_id, (int(page)-1)*page_size,
include_shorts=include_shorts)
url += "&pbj=1"
key = 'AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'
url = 'https://www.youtube.com/youtubei/v1/browse?key=' + key
ctoken = playlist_ctoken(playlist_id, (int(page)-1)*page_size,
include_shorts=include_shorts)
data = {
'context': {
'client': {
'hl': 'en',
'gl': 'US',
'clientName': 'WEB',
'clientVersion': '2.20240327.00.00',
},
},
'continuation': ctoken,
}
content_type_header = (('Content-Type', 'application/json'),)
content = util.fetch_url(
url, headers, report_text=report_text,
debug_name='playlist_videos'
url, util.desktop_xhr_headers + content_type_header,
data=json.dumps(data),
report_text=report_text, debug_name='playlist_videos'
)
info = json.loads(content.decode('utf-8'))
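Both functions above now post nearly the same JSON body to the browse endpoint: the first page sends a `browseId`, later pages send a `continuation` token. A sketch of the body construction alone, with no network access, is shown below. `build_browse_body` and its `make_ctoken` parameter are hypothetical names; in the diff the token comes from `playlist_ctoken`.

```python
import json


def build_browse_body(playlist_id, page=1, page_size=100, make_ctoken=None):
    """Return the JSON request body as a string.

    With no token factory the first-page form ('VL' + playlist id) is built;
    otherwise a continuation token for the page's offset is requested.
    """
    body = {
        'context': {
            'client': {
                'hl': 'en',
                'gl': 'US',
                'clientName': 'WEB',
                'clientVersion': '2.20240327.00.00',
            },
        },
    }
    if make_ctoken is None:
        body['browseId'] = 'VL' + playlist_id
    else:
        offset = (int(page) - 1) * page_size
        body['continuation'] = make_ctoken(playlist_id, offset)
    return json.dumps(body)
```

Factoring the shared `context` block out this way would also remove the duplication between `playlist_first_page` and `get_videos`, though the diff keeps them separate.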
@@ -78,6 +94,15 @@ def get_playlist_page():
abort(400)
playlist_id = request.args.get('list')
# Radio/Mix playlists (RD...) only work as watch page, not playlist page
if playlist_id.startswith('RD'):
first_video_id = playlist_id[2:] # video ID after 'RD' prefix
return flask.redirect(
util.URL_ORIGIN + '/watch?v=' + first_video_id + '&list=' + playlist_id,
302
)
page = request.args.get('page', '1')
if page == '1':
@@ -87,7 +112,7 @@ def get_playlist_page():
tasks = (
gevent.spawn(
playlist_first_page, playlist_id,
report_text="Retrieved playlist info", use_mobile=True
report_text="Retrieved playlist info"
),
gevent.spawn(get_videos, playlist_id, page)
)
@@ -106,7 +131,7 @@ def get_playlist_page():
for item in info.get('items', ()):
util.prefix_urls(item)
util.add_extra_html_info(item)
if 'id' in item:
if 'id' in item and not item.get('thumbnail'):
item['thumbnail'] = f"{settings.img_prefix}https://i.ytimg.com/vi/{item['id']}/hqdefault.jpg"
item['url'] += '&list=' + playlist_id
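The Radio/Mix redirect added to `get_playlist_page` derives the first video ID by dropping the two-character `RD` prefix. A minimal sketch of that URL construction follows; `radio_redirect_path` is a hypothetical helper, and the origin prefix (`util.URL_ORIGIN` in the diff) is left out.

```python
from urllib.parse import urlencode


def radio_redirect_path(playlist_id):
    """Return the watch-page path for a Mix playlist, or None for other IDs."""
    if not playlist_id.startswith('RD'):
        return None
    first_video_id = playlist_id[2:]  # everything after the 'RD' prefix
    return '/watch?' + urlencode({'v': first_video_id, 'list': playlist_id})
```

Like the code in the diff, this treats everything after `RD` as a video ID. Mix IDs with a longer prefix would need extra handling, which neither the diff nor this sketch attempts.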

@@ -113,12 +113,12 @@ def read_protobuf(data):
length = read_varint(data)
value = data.read(length)
elif wire_type == 3:
end_bytes = encode_varint((field_number << 3) | 4)
end_bytes = varint_encode((field_number << 3) | 4)
value = read_group(data, end_bytes)
elif wire_type == 5:
value = data.read(4)
else:
raise Exception("Unknown wire type: " + str(wire_type) + ", Tag: " + bytes_to_hex(succinct_encode(tag)) + ", at position " + str(data.tell()))
raise Exception("Unknown wire type: " + str(wire_type) + " at position " + str(data.tell()))
yield (wire_type, field_number, value)
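For readers unfamiliar with the wire format that `read_protobuf` walks, the following is a self-contained sketch of the same loop. It is not the project's implementation: `read_fields` is a hypothetical name, and group fields (wire types 3 and 4) are deliberately left out.

```python
import io


def read_varint(stream):
    """Read one base-128 varint from a binary stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError('truncated varint')
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result
        shift += 7


def read_fields(data):
    """Yield (wire_type, field_number, value) for each top-level field."""
    stream = io.BytesIO(data)
    while stream.tell() < len(data):
        tag = read_varint(stream)
        wire_type, field_number = tag & 7, tag >> 3
        if wire_type == 0:  # varint
            value = read_varint(stream)
        elif wire_type == 1:  # fixed 64-bit
            value = stream.read(8)
        elif wire_type == 2:  # length-delimited
            value = stream.read(read_varint(stream))
        elif wire_type == 5:  # fixed 32-bit
            value = stream.read(4)
        else:
            raise ValueError('Unknown wire type: %d at position %d'
                             % (wire_type, stream.tell()))
        yield wire_type, field_number, value
```

The tuple order matches what the `_decode_outer` test helper unpacks (`for _, fn, val in ...`), so a decoder like this could back the same kind of structural assertion.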

@@ -97,6 +97,7 @@ import re
import time
import json
import os
import traceback
import pprint

@@ -20,6 +20,29 @@
// TODO: Call abort to cancel in-progress appends?
// Buffer sizes for different systems
const BUFFER_CONFIG = {
default: 50 * 10**6, // 50 megabytes
webOS: 20 * 10**6, // 20 megabytes WebOS (LG)
samsungTizen: 20 * 10**6, // 20 megabytes Samsung Tizen OS
androidTV: 30 * 10**6, // 30 megabytes Android TV
desktop: 50 * 10**6, // 50 megabytes PC/Mac
};
function detectSystem() {
const userAgent = navigator.userAgent.toLowerCase();
if (/webos|lg browser/i.test(userAgent)) {
return "webOS";
} else if (/tizen/i.test(userAgent)) {
return "samsungTizen";
} else if (/android tv|smart-tv/i.test(userAgent)) {
return "androidTV";
} else if (/firefox|chrome|safari|edge/i.test(userAgent)) {
return "desktop";
} else {
return "default";
}
}
function AVMerge(video, srcInfo, startTime){
this.audioSource = null;
@@ -164,6 +187,8 @@ AVMerge.prototype.printDebuggingInfo = function() {
}
function Stream(avMerge, source, startTime, avRatio) {
const selectedSystem = detectSystem();
let baseBufferTarget = BUFFER_CONFIG[selectedSystem] || BUFFER_CONFIG.default;
this.avMerge = avMerge;
this.video = avMerge.video;
this.url = source['url'];
@@ -173,10 +198,11 @@ function Stream(avMerge, source, startTime, avRatio) {
this.mimeCodec = source['mime_codec']
this.streamType = source['acodec'] ? 'audio' : 'video';
if (this.streamType == 'audio') {
this.bufferTarget = avRatio*50*10**6;
this.bufferTarget = avRatio * baseBufferTarget;
} else {
this.bufferTarget = 50*10**6; // 50 megabytes
this.bufferTarget = baseBufferTarget;
}
console.info(`Detected system: ${selectedSystem}. Applying bufferTarget of ${this.bufferTarget} bytes to ${this.streamType}.`);
this.initRange = source['init_range'];
this.indexRange = source['index_range'];

@@ -114,3 +114,57 @@ function copyTextToClipboard(text) {
window.addEventListener('DOMContentLoaded', function() {
cur_track_idx = getDefaultTranscriptTrackIdx();
});
/**
* Thumbnail fallback handler
* Tries lower quality thumbnails when higher quality fails (404)
* Priority: hq720.jpg -> sddefault.jpg -> hqdefault.jpg
*/
function thumbnail_fallback(img) {
// Once src is set (image was loaded or attempted), always work with src
const src = img.src;
if (!src) return;
// Handle YouTube video thumbnails
if (src.includes('/i.ytimg.com/') || src.includes('/i.ytimg.com%2F')) {
// Extract video ID from URL
const match = src.match(/\/vi\/([^/]+)/);
if (!match) return;
const videoId = match[1];
const imgPrefix = settings_img_prefix || '';
// Define fallback order (from highest to lowest quality)
const fallbacks = [
'hq720.jpg',
'sddefault.jpg',
'hqdefault.jpg',
];
// Find current quality and try next fallback
for (let i = 0; i < fallbacks.length; i++) {
if (src.includes(fallbacks[i])) {
if (i < fallbacks.length - 1) {
img.src = imgPrefix + 'https://i.ytimg.com/vi/' + videoId + '/' + fallbacks[i + 1];
} else {
// Last fallback failed, stop retrying
img.onerror = null;
}
return;
}
}
// Unknown quality format, stop retrying
img.onerror = null;
}
// Handle YouTube channel avatars (ggpht.com)
else if (src.includes('ggpht.com') || src.includes('yt3.ggpht.com')) {
// Replace the whole size/option suffix so repeated retries cannot keep appending to it
const newSrc = src.replace(/=s\d+-c-k[^/?&#]*/, '=s240-c-k-c0x00ffffff-no-rj');
if (newSrc !== src) {
img.src = newSrc;
} else {
img.onerror = null;
}
} else {
img.onerror = null;
}
}
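
Note on `thumbnail_fallback`: the video-thumbnail branch is a walk down a fixed quality list. A minimal Python sketch of that walk follows, as a pure function so it can be checked without a browser. `next_thumbnail_url` is a hypothetical name and `abc123` in the usage note is a placeholder video ID; the list order is taken from the `fallbacks` array in the patch.

```python
import re

# Same order as the `fallbacks` array in thumbnail_fallback.
FALLBACKS = ["hq720.jpg", "sddefault.jpg", "hqdefault.jpg"]


def next_thumbnail_url(src, img_prefix=""):
    """Return the next lower-quality URL, or None when retrying should stop."""
    match = re.search(r"/vi/([^/]+)", src)
    if match is None:
        return None
    video_id = match.group(1)
    for i, name in enumerate(FALLBACKS):
        if name in src:
            if i + 1 < len(FALLBACKS):
                return f"{img_prefix}https://i.ytimg.com/vi/{video_id}/{FALLBACKS[i + 1]}"
            return None  # lowest listed quality already failed
    return None  # unknown quality name
```

Calling it on `https://i.ytimg.com/vi/abc123/hq720.jpg` yields the `sddefault.jpg` URL for the same ID, and calling it on the `hqdefault.jpg` URL yields `None`, which corresponds to the patch clearing `img.onerror`.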

View File

@@ -58,7 +58,7 @@
},
});
const player = new Plyr(document.getElementById('js-video-player'), {
const playerOptions = {
// Learning about autoplay permission https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Permissions-Policy/autoplay#syntax
autoplay: autoplayActive,
disableContextMenu: false,
@@ -117,5 +117,20 @@
tooltips: {
controls: true,
},
}
const player = new Plyr(document.getElementById('js-video-player'), playerOptions);
// disable double click to fullscreen
// https://github.com/sampotts/plyr/issues/1370#issuecomment-528966795
player.eventListeners.forEach(function(eventListener) {
if(eventListener.type === 'dblclick') {
eventListener.element.removeEventListener(eventListener.type, eventListener.callback, eventListener.options);
}
});
// Add .started property, true after the playback has been started
// Needed so controls won't be hidden before playback has started
player.started = false;
player.once('playing', function(){this.started = true});
})();

View File

@@ -5,8 +5,9 @@ function changeQuality(selection) {
let videoPaused = video.paused;
let videoSpeed = video.playbackRate;
let srcInfo;
if (avMerge)
if (avMerge && typeof avMerge.close === 'function') {
avMerge.close();
}
if (selection.type == 'uni'){
srcInfo = data['uni_sources'][selection.index];
video.src = srcInfo.url;

View File

@@ -37,3 +37,41 @@ e.g. Firefox playback speed options */
max-height: 320px;
overflow-y: auto;
}
/*
* Custom styles similar to youtube
*/
.plyr__controls {
display: flex;
justify-content: center;
}
.plyr__progress__container {
position: absolute;
bottom: 0;
width: 100%;
margin-bottom: -10px;
}
.plyr__controls .plyr__controls__item:first-child {
margin-left: 0;
margin-right: 0;
z-index: 5;
}
.plyr__controls .plyr__controls__item.plyr__volume {
margin-left: auto;
}
.plyr__controls .plyr__controls__item.plyr__progress__container {
padding-left: 10px;
padding-right: 10px;
}
.plyr__progress input[type="range"] {
margin-bottom: 50px;
}
/*
* End custom styles
*/

View File

@@ -128,6 +128,29 @@ header {
background-color: var(--buttom-hover);
}
.live-url-choices {
background-color: var(--thumb-background);
margin: 1rem 0;
padding: 1rem;
}
.playability-error {
position: relative;
box-sizing: border-box;
height: 30vh;
margin: 1rem 0;
}
.playability-error > span {
display: flex;
background-color: var(--thumb-background);
height: 100%;
object-fit: cover;
justify-content: center;
align-items: center;
text-align: center;
}
.playlist {
display: grid;
grid-gap: 4px;
@@ -622,6 +645,9 @@ figure.sc-video {
max-height: 80vh;
overflow-y: scroll;
}
.playability-error {
height: 60vh;
}
.playlist {
display: grid;
grid-gap: 1px;

View File

@@ -30,8 +30,7 @@ database_path = os.path.join(settings.data_dir, "subscriptions.sqlite")
def open_database():
if not os.path.exists(settings.data_dir):
os.makedirs(settings.data_dir)
os.makedirs(settings.data_dir, exist_ok=True)
connection = sqlite3.connect(database_path, check_same_thread=False)
try:
@@ -1089,12 +1088,26 @@ def serve_subscription_thumbnail(thumbnail):
f.close()
return flask.Response(image, mimetype='image/jpeg')
url = f"https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"
try:
image = util.fetch_url(url, report_text="Saved thumbnail: " + video_id)
except urllib.error.HTTPError as e:
print("Failed to download thumbnail for " + video_id + ": " + str(e))
abort(e.code)
image = None
for quality in ('hq720.jpg', 'sddefault.jpg', 'hqdefault.jpg'):
url = f"https://i.ytimg.com/vi/{video_id}/{quality}"
try:
image = util.fetch_url(url, report_text="Saved thumbnail: " + video_id)
break
except util.FetchError as e:
if '404' in str(e):
continue
print("Failed to download thumbnail for " + video_id + ": " + str(e))
flask.abort(500)
except urllib.error.HTTPError as e:
if e.code == 404:
continue
print("Failed to download thumbnail for " + video_id + ": " + str(e))
flask.abort(e.code)
if image is None:
flask.abort(404)
try:
f = open(thumbnail_path, 'wb')
except FileNotFoundError:

View File

@@ -26,6 +26,12 @@
// @license-end
</script>
{% endif %}
<script>
// @license magnet:?xt=urn:btih:0b31508aeb0634b347b8270c7bee4d411b5d4109&dn=agpl-3.0.txt AGPL-v3-or-Later
// Image prefix for thumbnails
let settings_img_prefix = "{{ settings.img_prefix or '' }}";
// @license-end
</script>
</head>
<body>
@@ -35,57 +41,57 @@
</nav>
<form class="form" id="site-search" action="/youtube.com/results">
<input type="search" name="search_query" class="search-box" value="{{ search_box_value }}"
{{ "autofocus" if (request.path in ("/", "/results") or error_message) else "" }} required placeholder="Type to search...">
<button type="submit" value="Search" class="search-button">Search</button>
{{ "autofocus" if (request.path in ("/", "/results") or error_message) else "" }} required placeholder="{{ _('Type to search...') }}">
<button type="submit" value="Search" class="search-button">{{ _('Search') }}</button>
<!-- options -->
<div class="dropdown">
<!-- hidden box -->
<input id="options-toggle-cbox" class="opt-box" type="checkbox">
<!-- end hidden box -->
<label class="dropdown-label" for="options-toggle-cbox">Options</label>
<label class="dropdown-label" for="options-toggle-cbox">{{ _('Options') }}</label>
<div class="dropdown-content">
<h3>Sort by</h3>
<h3>{{ _('Sort by') }}</h3>
<div class="option">
<input type="radio" id="sort_relevance" name="sort" value="0">
<label for="sort_relevance">Relevance</label>
<label for="sort_relevance">{{ _('Relevance') }}</label>
</div>
<div class="option">
<input type="radio" id="sort_upload_date" name="sort" value="2">
<label for="sort_upload_date">Upload date</label>
<label for="sort_upload_date">{{ _('Upload date') }}</label>
</div>
<div class="option">
<input type="radio" id="sort_view_count" name="sort" value="3">
<label for="sort_view_count">View count</label>
<label for="sort_view_count">{{ _('View count') }}</label>
</div>
<div class="option">
<input type="radio" id="sort_rating" name="sort" value="1">
<label for="sort_rating">Rating</label>
<label for="sort_rating">{{ _('Rating') }}</label>
</div>
<h3>Upload date</h3>
<h3>{{ _('Upload date') }}</h3>
<div class="option">
<input type="radio" id="time_any" name="time" value="0">
<label for="time_any">Any</label>
<label for="time_any">{{ _('Any') }}</label>
</div>
<div class="option">
<input type="radio" id="time_last_hour" name="time" value="1">
<label for="time_last_hour">Last hour</label>
<label for="time_last_hour">{{ _('Last hour') }}</label>
</div>
<div class="option">
<input type="radio" id="time_today" name="time" value="2">
<label for="time_today">Today</label>
<label for="time_today">{{ _('Today') }}</label>
</div>
<div class="option">
<input type="radio" id="time_this_week" name="time" value="3">
<label for="time_this_week">This week</label>
<label for="time_this_week">{{ _('This week') }}</label>
</div>
<div class="option">
<input type="radio" id="time_this_month" name="time" value="4">
<label for="time_this_month">This month</label>
<label for="time_this_month">{{ _('This month') }}</label>
</div>
<div class="option">
<input type="radio" id="time_this_year" name="time" value="5">
<label for="time_this_year">This year</label>
<label for="time_this_year">{{ _('This year') }}</label>
</div>
<h3>Type</h3>

View File

@@ -81,10 +81,10 @@
<!-- new-->
<div id="links-metadata">
{% if current_tab in ('videos', 'shorts', 'streams') %}
{% set sorts = [('1', 'views'), ('2', 'oldest'), ('3', 'newest'), ('4', 'newest - no shorts'),] %}
{% set sorts = [('3', 'newest'), ('4', 'newest - no shorts')] %}
<div id="number-of-results">{{ number_of_videos }} videos</div>
{% elif current_tab == 'playlists' %}
{% set sorts = [('2', 'oldest'), ('3', 'newest'), ('4', 'last video added')] %}
{% set sorts = [('3', 'newest'), ('4', 'last video added')] %}
{% if items %}
<h2 class="page-number">Page {{ page_number }}</h2>
{% else %}

View File

@@ -3,13 +3,13 @@
{% macro render_comment(comment, include_avatar, timestamp_links=False) %}
<div class="comment-container">
<div class="comment">
<a class="author-avatar" href="{{ comment['author_url'] }}" title="{{ comment['author'] }}">
<a class="author-avatar" href="{{ comment['author_url'] or '#' }}" title="{{ comment['author'] }}">
{% if include_avatar %}
<img class="author-avatar-img" alt="{{ comment['author'] }}" src="{{ comment['author_avatar'] }}">
{% endif %}
</a>
<address class="author-name">
<a class="author" href="{{ comment['author_url'] }}" title="{{ comment['author'] }}">{{ comment['author'] }}</a>
<a class="author" href="{{ comment['author_url'] or '#' }}" title="{{ comment['author'] }}">{{ comment['author'] }}</a>
</address>
<a class="permalink" href="{{ comment['permalink'] }}" title="permalink">
<span>{{ comment['time_published'] }}</span>

View File

@@ -20,14 +20,14 @@
{{ info['error'] }}
{% else %}
<div class="item-video {{ info['type'] + '-item' }}">
<a class="thumbnail-box" href="{{ info['url'] }}" title="{{ info['title'] }}">
<a class="thumbnail-box" href="{{ info['url'] or '#' }}" title="{{ info['title'] }}">
<div class="thumbnail {% if info['type'] == 'channel' %} channel {% endif %}">
{% if lazy_load %}
<img class="thumbnail-img lazy" alt="&#x20;" data-src="{{ info['thumbnail'] }}">
<img class="thumbnail-img lazy" alt="&#x20;" data-src="{{ info['thumbnail'] }}" onerror="thumbnail_fallback(this)">
{% elif info['type'] == 'channel' %}
<img class="thumbnail-img channel" alt="&#x20;" src="{{ info['thumbnail'] }}">
<img class="thumbnail-img channel" alt="&#x20;" src="{{ info['thumbnail'] }}" onerror="thumbnail_fallback(this)">
{% else %}
<img class="thumbnail-img" alt="&#x20;" src="{{ info['thumbnail'] }}">
<img class="thumbnail-img" alt="&#x20;" src="{{ info['thumbnail'] }}" onerror="thumbnail_fallback(this)">
{% endif %}
{% if info['type'] != 'channel' %}
@@ -35,7 +35,7 @@
{% endif %}
</div>
</a>
<h4 class="title"><a href="{{ info['url'] }}" title="{{ info['title'] }}">{{ info['title'] }}</a></h4>
<h4 class="title"><a href="{{ info['url'] or '#' }}" title="{{ info['title'] }}">{{ info['title'] }}</a></h4>
{% if include_author %}
{% set author_description = info['author'] %}
@@ -58,7 +58,9 @@
<div class="stats {{'horizontal-stats' if horizontal else 'vertical-stats'}}">
{% if info['type'] == 'channel' %}
<div>{{ info['approx_subscriber_count'] }} subscribers</div>
{% if info.get('approx_subscriber_count') %}
<div>{{ info['approx_subscriber_count'] }} subscribers</div>
{% endif %}
<div>{{ info['video_count']|commatize }} videos</div>
{% else %}
{% if info.get('time_published') %}

View File

@@ -10,11 +10,17 @@
<div class="playlist-metadata">
<div class="author">
{% if thumbnail %}
<img alt="{{ title }}" src="{{ thumbnail }}">
{% endif %}
<h2>{{ title }}</h2>
</div>
<div class="summary">
{% if author_url %}
<a class="playlist-author" href="{{ author_url }}">{{ author }}</a>
{% else %}
<span class="playlist-author">{{ author }}</span>
{% endif %}
</div>
<div class="playlist-stats">
<div>{{ video_count|commatize }} videos</div>

View File

@@ -31,11 +31,19 @@
<input type="number" id="{{ 'setting_' + setting_name }}" name="{{ setting_name }}" value="{{ value }}" step="1">
{% endif %}
{% elif setting_info['type'].__name__ == 'float' %}
<input type="number" id="{{ 'setting_' + setting_name }}" name="{{ setting_name }}" value="{{ value }}" step="0.01">
{% elif setting_info['type'].__name__ == 'str' %}
<input type="text" id="{{ 'setting_' + setting_name }}" name="{{ setting_name }}" value="{{ value }}">
{% if 'options' is in(setting_info) %}
<select id="{{ 'setting_' + setting_name }}" name="{{ setting_name }}">
{% for option in setting_info['options'] %}
<option value="{{ option[0] }}" {{ 'selected' if option[0] == value else '' }}>{{ option[1] }}</option>
{% endfor %}
</select>
{% else %}
<input type="text" id="{{ 'setting_' + setting_name }}" name="{{ setting_name }}" value="{{ value }}">
{% endif %}
{% else %}
<span>Error: Unknown setting type: setting_info['type'].__name__</span>
<span>Error: Unknown setting type: {{ setting_info['type'].__name__ }}</span>
{% endif %}
</li>
{% endif %}

View File

@@ -85,6 +85,7 @@
<option value='{"type": "pair", "index": {{ loop.index0}}}' {{ 'selected' if loop.index0 == pair_idx and using_pair_sources else '' }} >{{ src_pair['quality_string'] }}</option>
{% endfor %}
</select>
{% endif %}
</div>
<input class="v-checkbox" name="video_info_list" value="{{ video_info }}" form="playlist-edit" type="checkbox">
@@ -171,7 +172,11 @@
{% else %}
<li>{{ playlist['current_index']+1 }}/{{ playlist['video_count'] }}</li>
{% endif %}
{% if playlist['author_url'] %}
<li><a href="{{ playlist['author_url'] }}" title="{{ playlist['author'] }}">{{ playlist['author'] }}</a></li>
{% elif playlist['author'] %}
<li>{{ playlist['author'] }}</li>
{% endif %}
</ul>
</div>
<nav class="playlist-videos">
@@ -246,6 +251,7 @@
let storyboard_url = {{ storyboard_url | tojson }};
// @license-end
</script>
<script src="/youtube.com/static/js/common.js"></script>
<script src="/youtube.com/static/js/transcript-table.js"></script>
{% if settings.use_video_player == 2 %}

View File

@@ -1,4 +1,5 @@
from datetime import datetime
import logging
import settings
import socks
import sockshandler
@@ -18,6 +19,8 @@ import gevent.queue
import gevent.lock
import collections
import stem
logger = logging.getLogger(__name__)
import stem.control
import traceback
@@ -302,73 +305,140 @@ def fetch_url_response(url, headers=(), timeout=15, data=None,
def fetch_url(url, headers=(), timeout=15, report_text=None, data=None,
cookiejar_send=None, cookiejar_receive=None, use_tor=True,
debug_name=None):
while True:
start_time = time.monotonic()
"""
Fetch URL with exponential backoff retry logic for rate limiting.
response, cleanup_func = fetch_url_response(
url, headers, timeout=timeout, data=data,
cookiejar_send=cookiejar_send, cookiejar_receive=cookiejar_receive,
use_tor=use_tor)
response_time = time.monotonic()
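Retries:
- 429 Too Many Requests: with Tor, request a new identity and retry; without Tor, raise immediately
- 502/503/504 and connection errors: exponential backoff with jitter (about 1s, 2s, 4s, 8s)
- 302 redirect to Google Sorry page: treated as a rate limit (same handling as 429)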
content = response.read()
Max retries: 5 attempts with exponential backoff
"""
import random
read_finish = time.monotonic()
max_retries = 5
base_delay = 1.0 # Base delay in seconds
cleanup_func(response) # release_connection for urllib3
content = decode_content(
content,
response.headers.get('Content-Encoding', default='identity'))
for attempt in range(max_retries):
try:
start_time = time.monotonic()
if (settings.debugging_save_responses
and debug_name is not None
and content):
save_dir = os.path.join(settings.data_dir, 'debug')
if not os.path.exists(save_dir):
os.makedirs(save_dir)
response, cleanup_func = fetch_url_response(
url, headers, timeout=timeout, data=data,
cookiejar_send=cookiejar_send, cookiejar_receive=cookiejar_receive,
use_tor=use_tor)
response_time = time.monotonic()
with open(os.path.join(save_dir, debug_name), 'wb') as f:
f.write(content)
content = response.read()
if response.status == 429 or (
response.status == 302 and (response.getheader('Location') == url
or response.getheader('Location').startswith(
'https://www.google.com/sorry/index'
)
)
):
print(response.status, response.reason, response.headers)
ip = re.search(
br'IP address: ((?:[\da-f]*:)+[\da-f]+|(?:\d+\.)+\d+)',
content)
ip = ip.group(1).decode('ascii') if ip else None
if not ip:
ip = re.search(r'IP=((?:\d+\.)+\d+)',
response.getheader('Set-Cookie') or '')
ip = ip.group(1) if ip else None
read_finish = time.monotonic()
# don't get new identity if we're not using Tor
if not use_tor:
raise FetchError('429', reason=response.reason, ip=ip)
cleanup_func(response) # release_connection for urllib3
content = decode_content(
content,
response.headers.get('Content-Encoding', default='identity'))
print('Error: YouTube blocked the request because the Tor exit node is overutilized. Exit node IP address: %s' % ip)
if (settings.debugging_save_responses
and debug_name is not None
and content):
save_dir = os.path.join(settings.data_dir, 'debug')
os.makedirs(save_dir, exist_ok=True)
# get new identity
error = tor_manager.new_identity(start_time)
if error:
raise FetchError(
'429', reason=response.reason, ip=ip,
error_message='Automatic circuit change: ' + error)
else:
continue # retry now that we have new identity
with open(os.path.join(save_dir, debug_name), 'wb') as f:
f.write(content)
elif response.status >= 400:
raise FetchError(str(response.status), reason=response.reason,
ip=None)
break
# Check for rate limiting (429) or redirect to Google Sorry
if response.status == 429 or (
response.status == 302 and (response.getheader('Location') == url
or response.getheader('Location').startswith(
'https://www.google.com/sorry/index'
)
)
):
logger.info(f'Rate limit response: {response.status} {response.reason}')
ip = re.search(
br'IP address: ((?:[\da-f]*:)+[\da-f]+|(?:\d+\.)+\d+)',
content)
ip = ip.group(1).decode('ascii') if ip else None
if not ip:
ip = re.search(r'IP=((?:\d+\.)+\d+)',
response.getheader('Set-Cookie') or '')
ip = ip.group(1) if ip else None
# Without Tor, no point retrying with same IP
if not use_tor or not settings.route_tor:
logger.warning('Rate limited (429). Enable Tor routing to retry with new IP.')
raise FetchError('429', reason=response.reason, ip=ip)
# Tor: exhausted retries
if attempt >= max_retries - 1:
logger.error(f'Rate limited after {max_retries} retries. Exit IP: {ip}')
raise FetchError('429', reason=response.reason, ip=ip,
error_message='Tor exit node overutilized after multiple retries')
# Tor: get new identity and retry
logger.info(f'Rate limited. Getting new Tor identity... (IP: {ip})')
error = tor_manager.new_identity(start_time)
if error:
raise FetchError(
'429', reason=response.reason, ip=ip,
error_message='Automatic circuit change: ' + error)
continue # retry with new identity
# Check for client errors (400, 404) - don't retry these
if response.status == 400:
logger.error(f'Bad Request (400) - Invalid parameters or URL: {url[:100]}')
raise FetchError('400', reason='Bad Request - Invalid parameters or URL format', ip=None)
if response.status == 404:
logger.warning(f'Not Found (404): {url[:100]}')
raise FetchError('404', reason='Not Found', ip=None)
# Check for other server errors (503, 502, 504)
if response.status in (502, 503, 504):
if attempt >= max_retries - 1:
logger.error(f'Server error {response.status} after {max_retries} retries')
raise FetchError(str(response.status), reason=response.reason, ip=None)
# Exponential backoff for server errors
delay = (base_delay * (2 ** attempt)) + random.uniform(0, 1)
logger.warning(f'Server error ({response.status}). Waiting {delay:.1f}s before retry {attempt + 1}/{max_retries}...')
time.sleep(delay)
continue
# Success - break out of retry loop
break
except urllib3.exceptions.MaxRetryError as e:
# If this is the last attempt, raise the error
if attempt >= max_retries - 1:
exception_cause = e.__context__.__context__
if (isinstance(exception_cause, socks.ProxyConnectionError)
and settings.route_tor):
msg = ('Failed to connect to Tor. Check that Tor is open and '
'that your internet connection is working.\n\n'
+ str(e))
logger.error(f'Tor connection failed: {msg}')
raise FetchError('502', reason='Bad Gateway',
error_message=msg)
elif isinstance(e.__context__,
urllib3.exceptions.NewConnectionError):
msg = 'Failed to establish a connection.\n\n' + str(e)
logger.error(f'Connection failed: {msg}')
raise FetchError(
'502', reason='Bad Gateway',
error_message=msg)
else:
raise
# Wait and retry
delay = (base_delay * (2 ** attempt)) + random.uniform(0, 1)
logger.warning(f'Connection error. Waiting {delay:.1f}s before retry {attempt + 1}/{max_retries}...')
time.sleep(delay)
if report_text:
print(report_text, ' Latency:', round(response_time - start_time, 3), ' Read time:', round(read_finish - response_time,3))
logger.info(f'{report_text} - Latency: {round(response_time - start_time, 3)}s - Read time: {round(read_finish - response_time, 3)}s')
return content
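
Note on the retry loop in `fetch_url`: the wait before each retry is `base_delay * 2**attempt` plus up to one second of random jitter. The sketch below isolates that arithmetic. `backoff_delay` and `backoff_schedule` are hypothetical helper names, not part of the patch, and the injectable `rng` parameter exists only to make the jitter testable.

```python
import random


def backoff_delay(attempt, base_delay=1.0, jitter=1.0, rng=random.uniform):
    """Delay before the retry that follows 0-based `attempt`."""
    return base_delay * (2 ** attempt) + rng(0, jitter)


def backoff_schedule(max_retries=5, base_delay=1.0):
    """Nominal delays (jitter excluded) between the attempts of one call."""
    # The final attempt raises instead of sleeping, so there are
    # max_retries - 1 waits in total.
    return [base_delay * (2 ** attempt) for attempt in range(max_retries - 1)]
```

With the patch's values (`max_retries = 5`, `base_delay = 1.0`) the nominal waits are 1, 2, 4 and 8 seconds, because the fifth attempt raises rather than sleeping. The jitter term spreads out clients that would otherwise retry in lockstep.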
@@ -462,21 +532,31 @@ class RateLimitedQueue(gevent.queue.Queue):
def download_thumbnail(save_directory, video_id):
url = f"https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"
save_location = os.path.join(save_directory, video_id + ".jpg")
try:
thumbnail = fetch_url(url, report_text="Saved thumbnail: " + video_id)
except urllib.error.HTTPError as e:
print("Failed to download thumbnail for " + video_id + ": " + str(e))
return False
try:
f = open(save_location, 'wb')
except FileNotFoundError:
os.makedirs(save_directory, exist_ok=True)
f = open(save_location, 'wb')
f.write(thumbnail)
f.close()
return True
for quality in ('hq720.jpg', 'sddefault.jpg', 'hqdefault.jpg'):
url = f"https://i.ytimg.com/vi/{video_id}/{quality}"
try:
thumbnail = fetch_url(url, report_text="Saved thumbnail: " + video_id)
except FetchError as e:
if '404' in str(e):
continue
print("Failed to download thumbnail for " + video_id + ": " + str(e))
return False
except urllib.error.HTTPError as e:
if e.code == 404:
continue
print("Failed to download thumbnail for " + video_id + ": " + str(e))
return False
try:
f = open(save_location, 'wb')
except FileNotFoundError:
os.makedirs(save_directory, exist_ok=True)
f = open(save_location, 'wb')
f.write(thumbnail)
f.close()
return True
print("No thumbnail available for " + video_id)
return False
def download_thumbnails(save_directory, ids):
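
Note on `download_thumbnail`: the loop tries each quality in turn and treats only a 404 as "try the next one". A minimal sketch of that control flow follows, with the network call injected so it runs offline. The `FetchError` class here is a stand-in that models only a status code, `fetch_best_thumbnail` is a hypothetical name, and unlike the patch (which prints and returns `False`) this version re-raises non-404 errors.

```python
class FetchError(Exception):
    """Stand-in for util.FetchError; only the status code is modelled here."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


# Same order as the tuple in the patch.
QUALITIES = ("hq720.jpg", "sddefault.jpg", "hqdefault.jpg")


def fetch_best_thumbnail(video_id, fetch):
    """Return (quality, data) for the first quality that downloads, else None.

    `fetch` is any callable that takes a URL and returns bytes or raises
    FetchError. A 404 moves on to the next quality; other errors propagate.
    """
    for quality in QUALITIES:
        url = f"https://i.ytimg.com/vi/{video_id}/{quality}"
        try:
            return quality, fetch(url)
        except FetchError as e:
            if e.code != "404":
                raise
    return None
```

Comparing a status code, as assumed here, is stricter than the patch's `'404' in str(e)` test, which would also match a message that merely contains those digits.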
@@ -502,9 +582,40 @@ def video_id(url):
return urllib.parse.parse_qs(url_parts.query)['v'][0]
# default, sddefault, mqdefault, hqdefault, hq720
def get_thumbnail_url(video_id):
return f"{settings.img_prefix}https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"
def get_thumbnail_url(video_id, quality='hq720'):
"""Get thumbnail URL with fallback to lower quality if needed.
Args:
video_id: YouTube video ID
quality: Preferred quality ('maxres', 'hq720', 'sd', 'hq', 'mq', 'default')
Returns:
Tuple of (url, filename) for the first entry in the requested quality's fallback list
"""
# Quality priority order (highest to lowest)
quality_order = {
'maxres': ['maxresdefault.jpg', 'sddefault.jpg', 'hqdefault.jpg'],
'hq720': ['hq720.jpg', 'sddefault.jpg', 'hqdefault.jpg'],
'sd': ['sddefault.jpg', 'hqdefault.jpg'],
'hq': ['hqdefault.jpg', 'mqdefault.jpg'],
'mq': ['mqdefault.jpg', 'default.jpg'],
'default': ['default.jpg'],
}
qualities = quality_order.get(quality, quality_order['hq720'])
base_url = f"{settings.img_prefix}https://i.ytimg.com/vi/{video_id}/"
# For now, return the highest-quality URL without checking that it exists.
# A 404 is left to the client: the templates call thumbnail_fallback from onerror.
return base_url + qualities[0], qualities[0]
def get_best_thumbnail_url(video_id):
"""Get the best available thumbnail URL for a video.
Tries hq720 first (for HD videos), falls back to sddefault for SD videos.
"""
return get_thumbnail_url(video_id, quality='hq720')[0]
def seconds_to_timestamp(seconds):
@@ -538,6 +649,12 @@ def prefix_url(url):
if url is None:
return None
url = url.lstrip('/') # some urls have // before them, which has a special meaning
# Increase resolution for YouTube channel avatars
if url and ('ggpht.com' in url or 'yt3.ggpht.com' in url):
# Replace size parameter with higher resolution (s240 instead of s88)
url = re.sub(r'=s\d+-c-k', '=s240-c-k-c0x00ffffff-no-rj', url)
return '/' + url
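
Note on the avatar rewrite in `prefix_url`: the substitution replaces only the `=sNN-c-k` prefix, so a URL that already ends in `-c0x00ffffff-no-rj` ends up with that tail twice. The sketch below shows the effect next to a variant that consumes the whole option string. The variant is an assumption for illustration, not what the patch does, and `EXAMPLE` stands in for a real avatar path. Whether the image server cares about the repeated tail is not established here.

```python
import re

AVATAR_SUFFIX = "=s240-c-k-c0x00ffffff-no-rj"


def upscale_as_patched(url):
    # Same substitution as the patch: only the "=sNN-c-k" prefix is replaced.
    return re.sub(r"=s\d+-c-k", AVATAR_SUFFIX, url)


def upscale_whole_suffix(url):
    # Variant (an assumption, not the patch): also consume the rest of the
    # option string so an existing "-c0x00ffffff-no-rj" tail is not repeated.
    return re.sub(r"=s\d+-c-k[^/?&#]*", AVATAR_SUFFIX, url)
```

The second form is idempotent: applying it to its own output returns the same URL, which also gives a client-side retry handler a natural stopping condition.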
@@ -720,9 +837,12 @@ INNERTUBE_CLIENTS = {
'hl': 'en',
'gl': 'US',
'clientName': 'IOS',
'clientVersion': '19.09.3',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtube/19.09.3 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
'clientVersion': '21.03.2',
'deviceMake': 'Apple',
'deviceModel': 'iPhone16,2',
'osName': 'iPhone',
'osVersion': '18.7.2.22H124',
'userAgent': 'com.google.ios.youtube/21.03.2 (iPhone16,2; U; CPU iOS 18_7_2 like Mac OS X)'
}
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
@@ -784,8 +904,7 @@ INNERTUBE_CLIENTS = {
def get_visitor_data():
visitor_data = None
visitor_data_cache = os.path.join(settings.data_dir, 'visitorData.txt')
if not os.path.exists(settings.data_dir):
os.makedirs(settings.data_dir)
os.makedirs(settings.data_dir, exist_ok=True)
if os.path.isfile(visitor_data_cache):
with open(visitor_data_cache, 'r') as file:
print('Getting visitor_data from cache')
@@ -840,6 +959,8 @@ def call_youtube_api(client, api, data):
def strip_non_ascii(string):
''' Returns the string without non ASCII characters'''
if string is None:
return ""
stripped = (c for c in string if 0 < ord(c) < 127)
return ''.join(stripped)

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '0.2.21'
__version__ = 'v0.4.5'

View File

@@ -6,6 +6,9 @@ import settings
from flask import request
import flask
import logging
logger = logging.getLogger(__name__)
import json
import gevent
@@ -177,8 +180,34 @@ def make_caption_src(info, lang, auto=False, trans_lang=None):
label += ' (Automatic)'
if trans_lang:
label += ' -> ' + trans_lang
# Try to use Android caption URL directly (no PO Token needed)
caption_url = None
for track in info.get('_android_caption_tracks', []):
track_lang = track.get('languageCode', '')
track_kind = track.get('kind', '')
if track_lang == lang and (
(auto and track_kind == 'asr') or
(not auto and track_kind != 'asr')
):
caption_url = track.get('baseUrl')
break
if caption_url:
# Add format
if '&fmt=' in caption_url:
caption_url = re.sub(r'&fmt=[^&]*', '&fmt=vtt', caption_url)
else:
caption_url += '&fmt=vtt'
if trans_lang:
caption_url += '&tlang=' + trans_lang
url = util.prefix_url(caption_url)
else:
# Fallback to old method
url = util.prefix_url(yt_data_extract.get_caption_url(info, lang, 'vtt', auto, trans_lang))
return {
'url': util.prefix_url(yt_data_extract.get_caption_url(info, lang, 'vtt', auto, trans_lang)),
'url': url,
'label': label,
'srclang': trans_lang[0:2] if trans_lang else lang[0:2],
'on': False,
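
Note on the caption URL handling in `make_caption_src`: the patch forces `fmt=vtt` with a regex that looks for `&fmt=`, so a URL whose first query parameter is `fmt` would get a second `fmt` appended. One possible alternative, sketched below with the standard library, parses the query string and sets each parameter exactly once. `set_caption_params` is a hypothetical helper and the URLs in the usage note are simplified examples. Re-encoding the query may change how some characters are percent-escaped, which is worth checking against signed URLs before relying on it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_caption_params(base_url, fmt="vtt", trans_lang=None):
    """Return base_url with fmt (and optionally tlang) set exactly once."""
    parts = urlsplit(base_url)
    # Drop any existing fmt/tlang so each parameter appears only once.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in ("fmt", "tlang")]
    query.append(("fmt", fmt))
    if trans_lang:
        query.append(("tlang", trans_lang))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For `https://www.youtube.com/api/timedtext?fmt=srv3&v=abc` this returns a URL whose query is `v=abc&fmt=vtt`, with the original `fmt=srv3` removed.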
@@ -300,11 +329,8 @@ def get_ordered_music_list_attributes(music_list):
def save_decrypt_cache():
try:
f = open(os.path.join(settings.data_dir, 'decrypt_function_cache.json'), 'w')
except FileNotFoundError:
os.makedirs(settings.data_dir)
f = open(os.path.join(settings.data_dir, 'decrypt_function_cache.json'), 'w')
os.makedirs(settings.data_dir, exist_ok=True)
f = open(os.path.join(settings.data_dir, 'decrypt_function_cache.json'), 'w')
f.write(json.dumps({'version': 1, 'decrypt_cache':decrypt_cache}, indent=4, sort_keys=True))
f.close()
@@ -367,32 +393,61 @@ def fetch_watch_page_info(video_id, playlist_id, index):
watch_page = watch_page.decode('utf-8')
return yt_data_extract.extract_watch_info_from_html(watch_page)
def extract_info(video_id, use_invidious, playlist_id=None, index=None):
primary_client = 'android_vr'
fallback_client = 'ios'
last_resort_client = 'tv_embedded'
tasks = (
# Get video metadata from here
gevent.spawn(fetch_watch_page_info, video_id, playlist_id, index),
gevent.spawn(fetch_player_response, 'android_vr', video_id)
gevent.spawn(fetch_player_response, primary_client, video_id)
)
gevent.joinall(tasks)
util.check_gevent_exceptions(*tasks)
info, player_response = tasks[0].value, tasks[1].value
info = tasks[0].value or {}
player_response = tasks[1].value or {}
# Save android_vr caption tracks (no PO Token needed for these URLs)
if isinstance(player_response, str):
try:
pr_data = json.loads(player_response)
except Exception:
pr_data = {}
else:
pr_data = player_response or {}
android_caption_tracks = yt_data_extract.deep_get(
pr_data, 'captions', 'playerCaptionsTracklistRenderer',
'captionTracks', default=[])
info['_android_caption_tracks'] = android_caption_tracks
yt_data_extract.update_with_new_urls(info, player_response)
# Age restricted video, retry
if info['age_restricted'] or info['player_urls_missing']:
if info['age_restricted']:
print('Age restricted video, retrying')
else:
print('Player urls missing, retrying')
player_response = fetch_player_response('tv_embedded', video_id)
yt_data_extract.update_with_new_urls(info, player_response)
# Fallback to 'ios' if no valid URLs are found
if not info.get('formats') or info.get('player_urls_missing'):
print(f"No URLs found in '{primary_client}', attempting with '{fallback_client}'.")
try:
player_response = fetch_player_response(fallback_client, video_id) or {}
yt_data_extract.update_with_new_urls(info, player_response)
except util.FetchError as e:
print(f"Fallback '{fallback_client}' failed: {e}")
# Final attempt with 'tv_embedded' if there are still no URLs
if not info.get('formats') or info.get('player_urls_missing'):
print(f"No URLs found in '{fallback_client}', attempting with '{last_resort_client}'")
try:
player_response = fetch_player_response(last_resort_client, video_id) or {}
yt_data_extract.update_with_new_urls(info, player_response)
except util.FetchError as e:
print(f"Fallback '{last_resort_client}' failed: {e}")
# signature decryption
decryption_error = decrypt_signatures(info, video_id)
if decryption_error:
decryption_error = 'Error decrypting url signatures: ' + decryption_error
info['playability_error'] = decryption_error
if info.get('formats'):
decryption_error = decrypt_signatures(info, video_id)
if decryption_error:
info['playability_error'] = 'Error decrypting url signatures: ' + decryption_error
# check if urls ready (non-live format) in former livestream
# urls not ready if all of them have no filesize
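
Note on the client fallback in `extract_info`: the patch tries `android_vr`, then `ios`, then `tv_embedded`, moving on whenever no usable formats came back. The sketch below expresses that as one loop over an ordered client list. `fetch_with_fallback` and `has_urls` are hypothetical names, and the fetcher is injected so the example runs without network access. It differs from the patch in that every client, including the first, is guarded by the same `try`; in the patch the primary request runs as a gevent task and its failure is raised by `check_gevent_exceptions`.

```python
def fetch_with_fallback(video_id, clients, fetch_player_response, has_urls):
    """Try each client in order; return (client, response) for the first usable one.

    fetch_player_response(client, video_id) may raise; has_urls(response)
    decides whether the response is good enough to stop. Returns (None, {})
    when every client fails or yields nothing usable.
    """
    for client in clients:
        try:
            response = fetch_player_response(client, video_id) or {}
        except Exception as e:  # the patch catches util.FetchError specifically
            print(f"Client '{client}' failed: {e}")
            continue
        if has_urls(response):
            return client, response
        print(f"No URLs found in '{client}'")
    return None, {}
```

Keeping the client order in a single tuple makes it easy to reorder or extend the chain without duplicating the retry block for each client.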
@@ -406,21 +461,21 @@ def extract_info(video_id, use_invidious, playlist_id=None, index=None):
# livestream urls
# sometimes only the livestream urls work soon after the livestream is over
if (info['hls_manifest_url']
and (info['live'] or not info['formats'] or not info['urls_ready'])
):
manifest = util.fetch_url(info['hls_manifest_url'],
debug_name='hls_manifest.m3u8',
report_text='Fetched hls manifest'
).decode('utf-8')
info['hls_formats'], err = yt_data_extract.extract_hls_formats(manifest)
if not err:
info['playability_error'] = None
for fmt in info['hls_formats']:
fmt['video_quality'] = video_quality_string(fmt)
else:
info['hls_formats'] = []
info['hls_formats'] = []
if info.get('hls_manifest_url') and (info.get('live') or not info.get('formats') or not info['urls_ready']):
try:
manifest = util.fetch_url(info['hls_manifest_url'],
debug_name='hls_manifest.m3u8',
report_text='Fetched hls manifest'
).decode('utf-8')
info['hls_formats'], err = yt_data_extract.extract_hls_formats(manifest)
if not err:
info['playability_error'] = None
for fmt in info['hls_formats']:
fmt['video_quality'] = video_quality_string(fmt)
except Exception as e:
print(f"Error fetching HLS manifest: {e}")
info['hls_formats'] = []
# check for 403. Unnecessary for tor video routing b/c ip address is same
info['invidious_used'] = False
@@ -615,7 +670,12 @@ def get_watch_page(video_id=None):
# prefix urls, and other post-processing not handled by yt_data_extract
for item in info['related_videos']:
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['id']) # set HQ relateds thumbnail videos
# Only set thumbnail if YouTube didn't provide one
if not item.get('thumbnail'):
if item.get('type') == 'playlist' and item.get('first_video_id'):
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['first_video_id'])
elif item.get('type') == 'video' and item.get('id'):
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['id'])
util.prefix_urls(item)
util.add_extra_html_info(item)
for song in info['music_list']:
@@ -623,6 +683,9 @@ def get_watch_page(video_id=None):
if info['playlist']:
playlist_id = info['playlist']['id']
for item in info['playlist']['items']:
# Only set thumbnail if YouTube didn't provide one
if not item.get('thumbnail') and item.get('type') == 'video' and item.get('id'):
item['thumbnail'] = "https://i.ytimg.com/vi/{}/hqdefault.jpg".format(item['id'])
util.prefix_urls(item)
util.add_extra_html_info(item)
if playlist_id:
@@ -807,9 +870,14 @@ def get_watch_page(video_id=None):
@yt_app.route('/api/<path:dummy>')
def get_captions(dummy):
result = util.fetch_url('https://www.youtube.com' + request.full_path)
result = result.replace(b"align:start position:0%", b"")
return result
url = 'https://www.youtube.com' + request.full_path
try:
result = util.fetch_url(url, headers=util.mobile_ua)
result = result.replace(b"align:start position:0%", b"")
return flask.Response(result, mimetype='text/vtt')
except Exception as e:
logger.debug(f'Caption fetch failed: {e}')
return flask.Response(b'WEBVTT\n\n', mimetype='text/vtt', status=200)
times_reg = re.compile(r'^\d\d:\d\d:\d\d\.\d\d\d --> \d\d:\d\d:\d\d\.\d\d\d.*$')
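
Note on the `get_captions` change above: the new version always answers with `text/vtt`, returning an empty WebVTT document when the upstream fetch fails. A minimal sketch of that fallback in isolation, assuming a hypothetical `fetch` callable in place of `util.fetch_url` (the helper name `fetch_captions_or_empty` is illustrative and not part of the codebase):

```python
EMPTY_VTT = b'WEBVTT\n\n'

def fetch_captions_or_empty(fetch, url):
    """Return (body, mimetype); fall back to an empty WebVTT body on any error.

    `fetch` is a hypothetical stand-in for util.fetch_url: it takes a URL
    and returns bytes, or raises on failure.
    """
    try:
        body = fetch(url)
    except Exception:
        # The player still receives a valid (empty) caption track.
        return EMPTY_VTT, 'text/vtt'
    # Strip the positioning cue setting, as the route in the diff does.
    return body.replace(b"align:start position:0%", b""), 'text/vtt'
```

Returning a valid empty track keeps the failure from surfacing as an error response to the video player, at the cost of hiding it from the user, so logging the exception (as the diff does) remains useful.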

@@ -226,6 +226,190 @@ def check_missing_keys(object, *key_sequences):
return None
def extract_lockup_view_model_info(item, additional_info={}):
"""Extract info from new lockupViewModel format (YouTube 2024+)"""
info = {'error': None}
content_type = item.get('contentType', '')
content_id = item.get('contentId', '')
# Extract title from metadata
metadata = item.get('metadata', {})
lockup_metadata = metadata.get('lockupMetadataViewModel', {})
title_data = lockup_metadata.get('title', {})
info['title'] = title_data.get('content', '')
# Determine type based on contentType
if 'PLAYLIST' in content_type or 'PODCAST' in content_type:
info['type'] = 'playlist'
info['playlist_type'] = 'playlist'
info['id'] = content_id
info['video_count'] = None
info['first_video_id'] = None
# Try to get video count from metadata
metadata_rows = lockup_metadata.get('metadata', {})
for row in metadata_rows.get('contentMetadataViewModel', {}).get('metadataRows', []):
for part in row.get('metadataParts', []):
text = part.get('text', {}).get('content', '')
if 'video' in text.lower() or 'episode' in text.lower():
info['video_count'] = extract_int(text)
elif 'VIDEO' in content_type:
info['type'] = 'video'
info['id'] = content_id
info['view_count'] = None
info['approx_view_count'] = None
info['time_published'] = None
info['duration'] = None
# Extract duration/other info from metadata rows
metadata_rows = lockup_metadata.get('metadata', {})
for row in metadata_rows.get('contentMetadataViewModel', {}).get('metadataRows', []):
for part in row.get('metadataParts', []):
text = part.get('text', {}).get('content', '')
if 'view' in text.lower():
info['approx_view_count'] = extract_approx_int(text)
elif 'ago' in text.lower():
info['time_published'] = text
elif 'CHANNEL' in content_type:
info['type'] = 'channel'
info['id'] = content_id
info['approx_subscriber_count'] = None
info['video_count'] = None
# Extract subscriber count and video count from metadata rows
metadata_rows = lockup_metadata.get('metadata', {})
for row in metadata_rows.get('contentMetadataViewModel', {}).get('metadataRows', []):
for part in row.get('metadataParts', []):
text = part.get('text', {}).get('content', '')
if 'subscriber' in text.lower():
info['approx_subscriber_count'] = extract_approx_int(text)
elif 'video' in text.lower():
info['video_count'] = extract_int(text)
else:
info['type'] = 'unsupported'
return info
# Extract thumbnail from contentImage
content_image = item.get('contentImage', {})
info['thumbnail'] = normalize_url(multi_deep_get(content_image,
# playlists with collection thumbnail
['collectionThumbnailViewModel', 'primaryThumbnail', 'thumbnailViewModel', 'image', 'sources', 0, 'url'],
# single thumbnail (some playlists, videos)
['thumbnailViewModel', 'image', 'sources', 0, 'url'],
)) or ''
# Extract video/episode count from thumbnail overlay badges
# (podcasts and some playlists put the count here instead of metadata rows)
thumb_vm = multi_deep_get(content_image,
['collectionThumbnailViewModel', 'primaryThumbnail', 'thumbnailViewModel'],
['thumbnailViewModel'],
) or {}
for overlay in thumb_vm.get('overlays', []):
for badge in deep_get(overlay, 'thumbnailOverlayBadgeViewModel', 'thumbnailBadges', default=[]):
badge_text = deep_get(badge, 'thumbnailBadgeViewModel', 'text', default='')
if badge_text and not info.get('video_count'):
conservative_update(info, 'video_count', extract_int(badge_text))
# Extract author info if available
info['author'] = None
info['author_id'] = None
info['author_url'] = None
info['description'] = None
info['badges'] = []
# Try to get first video ID from inline player data
item_playback = item.get('itemPlayback', {})
inline_player = item_playback.get('inlinePlayerData', {})
on_select = inline_player.get('onSelect', {})
innertube_cmd = on_select.get('innertubeCommand', {})
watch_endpoint = innertube_cmd.get('watchEndpoint', {})
if watch_endpoint.get('videoId'):
info['first_video_id'] = watch_endpoint.get('videoId')
info.update(additional_info)
return info
def extract_shorts_lockup_view_model_info(item, additional_info={}):
"""Extract info from shortsLockupViewModel format (YouTube Shorts)"""
info = {'error': None, 'type': 'video'}
# Video ID from reelWatchEndpoint or entityId
info['id'] = deep_get(item,
'onTap', 'innertubeCommand', 'reelWatchEndpoint', 'videoId')
if not info['id']:
entity_id = item.get('entityId', '')
if entity_id.startswith('shorts-shelf-item-'):
info['id'] = entity_id[len('shorts-shelf-item-'):]
# Thumbnail
info['thumbnail'] = normalize_url(deep_get(item,
'onTap', 'innertubeCommand', 'reelWatchEndpoint',
'thumbnail', 'thumbnails', 0, 'url'))
# Parse title and views from accessibilityText
# Format: "Title, N views - play Short"
acc_text = item.get('accessibilityText', '')
info['title'] = ''
info['view_count'] = None
info['approx_view_count'] = None
if acc_text:
# Remove trailing " - play Short"
cleaned = re.sub(r'\s*-\s*play Short$', '', acc_text)
# Split on last comma+views pattern to separate title from view count
match = re.match(r'^(.*?),\s*([\d,.]+\s*(?:thousand|million|billion|)\s*views?)$',
cleaned, re.IGNORECASE)
if match:
info['title'] = match.group(1).strip()
view_text = match.group(2)
info['view_count'] = extract_int(view_text)
# Convert "7.1 thousand" -> "7.1 K" for display
suffix_map = {'thousand': 'K', 'million': 'M', 'billion': 'B'}
suffix_match = re.search(r'([\d,.]+)\s*(thousand|million|billion)?', view_text, re.IGNORECASE)
if suffix_match:
num = suffix_match.group(1)
word = suffix_match.group(2)
if word:
info['approx_view_count'] = num + ' ' + suffix_map[word.lower()]
else:
info['approx_view_count'] = '{:,}'.format(int(num.replace(',', ''))) if num.replace(',', '').isdigit() else num
else:
info['approx_view_count'] = extract_approx_int(view_text)
else:
# Fallback: try "N views" at end
match2 = re.match(r'^(.*?),\s*(.+views?)$', cleaned, re.IGNORECASE)
if match2:
info['title'] = match2.group(1).strip()
info['approx_view_count'] = extract_approx_int(match2.group(2))
else:
info['title'] = cleaned
# Overlay text (usually has the title too)
overlay_metadata = deep_get(item, 'overlayMetadata',
'secondaryText', 'content')
if overlay_metadata and not info['approx_view_count']:
info['approx_view_count'] = extract_approx_int(overlay_metadata)
primary_text = deep_get(item, 'overlayMetadata',
'primaryText', 'content')
if primary_text and not info['title']:
info['title'] = primary_text
info['duration'] = ''
info['time_published'] = None
info['description'] = None
info['badges'] = []
info['author'] = None
info['author_id'] = None
info['author_url'] = None
info['index'] = None
info.update(additional_info)
return info
def extract_item_info(item, additional_info={}):
if not item:
return {'error': 'No item given'}
@@ -243,6 +427,14 @@ def extract_item_info(item, additional_info={}):
info['type'] = 'unsupported'
return info
# Handle new lockupViewModel format (YouTube 2024+)
if type == 'lockupViewModel':
return extract_lockup_view_model_info(item, additional_info)
# Handle shortsLockupViewModel format (YouTube Shorts)
if type == 'shortsLockupViewModel':
return extract_shorts_lockup_view_model_info(item, additional_info)
# type looks like e.g. 'compactVideoRenderer' or 'gridVideoRenderer'
# camelCase split, https://stackoverflow.com/a/37697078
type_parts = [s.lower() for s in re.sub(r'([A-Z][a-z]+)', r' \1', type).split()]
@@ -282,9 +474,9 @@ def extract_item_info(item, additional_info={}):
['detailedMetadataSnippets', 0, 'snippetText'],
))
info['thumbnail'] = normalize_url(multi_deep_get(item,
['thumbnail', 'thumbnails', 0, 'url'], # videos
['thumbnails', 0, 'thumbnails', 0, 'url'], # playlists
['thumbnailRenderer', 'showCustomThumbnailRenderer', 'thumbnail', 'thumbnails', 0, 'url'], # shows
['thumbnail', 'thumbnails', -1, 'url'], # videos (highest quality)
['thumbnails', 0, 'thumbnails', -1, 'url'], # playlists
['thumbnailRenderer', 'showCustomThumbnailRenderer', 'thumbnail', 'thumbnails', -1, 'url'], # shows
))
info['badges'] = []
@@ -376,6 +568,13 @@ def extract_item_info(item, additional_info={}):
elif primary_type == 'channel':
info['id'] = item.get('channelId')
info['approx_subscriber_count'] = extract_approx_int(item.get('subscriberCountText'))
# YouTube sometimes puts the handle (@name) in subscriberCountText
# instead of the actual count. Fall back to accessibility data.
if not info['approx_subscriber_count']:
acc_label = deep_get(item, 'subscriberCountText',
'accessibility', 'accessibilityData', 'label', default='')
if 'subscriber' in acc_label.lower():
info['approx_subscriber_count'] = extract_approx_int(acc_label)
elif primary_type == 'show':
info['id'] = deep_get(item, 'navigationEndpoint', 'watchEndpoint', 'playlistId')
info['first_video_id'] = deep_get(item, 'navigationEndpoint',
@@ -441,6 +640,10 @@ _item_types = {
'channelRenderer',
'compactChannelRenderer',
'gridChannelRenderer',
# New viewModel format (YouTube 2024+)
'lockupViewModel',
'shortsLockupViewModel',
}
def _traverse_browse_renderer(renderer):
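
Note on `extract_shorts_lockup_view_model_info` above: the title and view count are recovered from `accessibilityText`, which the diff's comment describes as "Title, N views - play Short". A simplified, standalone sketch of that parsing step follows. The function name `parse_shorts_label` and the sample strings are illustrative assumptions, and the real function also falls back to `overlayMetadata`, which this sketch omits:

```python
import re

# Word suffixes as they appear in the accessibility label, mapped to short forms.
SUFFIXES = {'thousand': 'K', 'million': 'M', 'billion': 'B'}

# Lazy title group, then a number, an optional magnitude word, and "view(s)".
VIEWS_RE = re.compile(
    r'^(.*?),\s*([\d,.]+)\s*(thousand|million|billion)?\s*views?$',
    re.IGNORECASE)

def parse_shorts_label(acc_text):
    """Split an accessibility label into (title, approximate view count).

    Returns (cleaned_text, None) when no view count can be found.
    """
    cleaned = re.sub(r'\s*-\s*play Short$', '', acc_text)
    match = VIEWS_RE.match(cleaned)
    if not match:
        return cleaned, None
    title, number, word = match.groups()
    if word:
        return title.strip(), number + ' ' + SUFFIXES[word.lower()]
    return title.strip(), number

print(parse_shorts_label('My clip, 7.1 thousand views - play Short'))
print(parse_shorts_label('Hello, world, 12 views - play Short'))
```

Because the title group is lazy and the view pattern is anchored at the end of the string, commas inside the title are kept with the title rather than being mistaken for the separator.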

@@ -218,40 +218,100 @@ def extract_playlist_metadata(polymer_json):
return {'error': err}
metadata = {'error': None}
header = deep_get(response, 'header', 'playlistHeaderRenderer', default={})
metadata['title'] = extract_str(header.get('title'))
metadata['title'] = None
metadata['first_video_id'] = None
metadata['thumbnail'] = None
metadata['video_count'] = None
metadata['description'] = ''
metadata['author'] = None
metadata['author_id'] = None
metadata['author_url'] = None
metadata['view_count'] = None
metadata['like_count'] = None
metadata['time_published'] = None
header = deep_get(response, 'header', 'playlistHeaderRenderer', default={})
if header:
# Classic playlistHeaderRenderer format
metadata['title'] = extract_str(header.get('title'))
metadata['first_video_id'] = deep_get(header, 'playEndpoint', 'watchEndpoint', 'videoId')
first_id = re.search(r'([a-z_\-]{11})', deep_get(header,
'thumbnail', 'thumbnails', 0, 'url', default=''))
if first_id:
conservative_update(metadata, 'first_video_id', first_id.group(1))
metadata['video_count'] = extract_int(header.get('numVideosText'))
metadata['description'] = extract_str(header.get('descriptionText'), default='')
metadata['author'] = extract_str(header.get('ownerText'))
metadata['author_id'] = multi_deep_get(header,
['ownerText', 'runs', 0, 'navigationEndpoint', 'browseEndpoint', 'browseId'],
['ownerEndpoint', 'browseEndpoint', 'browseId'])
metadata['view_count'] = extract_int(header.get('viewCountText'))
metadata['like_count'] = extract_int(header.get('likesCountWithoutLikeText'))
for stat in header.get('stats', ()):
text = extract_str(stat)
if 'videos' in text or 'episodes' in text:
conservative_update(metadata, 'video_count', extract_int(text))
elif 'views' in text:
conservative_update(metadata, 'view_count', extract_int(text))
elif 'updated' in text:
metadata['time_published'] = extract_date(text)
else:
# New pageHeaderRenderer format (YouTube 2024+)
page_header = deep_get(response, 'header', 'pageHeaderRenderer', default={})
metadata['title'] = page_header.get('pageTitle')
view_model = deep_get(page_header, 'content', 'pageHeaderViewModel', default={})
# Extract title from viewModel if not found
if not metadata['title']:
metadata['title'] = deep_get(view_model,
'title', 'dynamicTextViewModel', 'text', 'content')
# Extract metadata from rows (author, video count, views, etc.)
meta_rows = deep_get(view_model,
'metadata', 'contentMetadataViewModel', 'metadataRows', default=[])
for row in meta_rows:
for part in row.get('metadataParts', []):
text_content = deep_get(part, 'text', 'content', default='')
# Author from avatarStack
avatar_stack = deep_get(part, 'avatarStack', 'avatarStackViewModel', default={})
if avatar_stack:
author_text = deep_get(avatar_stack, 'text', 'content')
if author_text:
metadata['author'] = author_text
# Extract author_id from commandRuns
for run in deep_get(avatar_stack, 'text', 'commandRuns', default=[]):
browse_id = deep_get(run, 'onTap', 'innertubeCommand',
'browseEndpoint', 'browseId')
if browse_id:
metadata['author_id'] = browse_id
# Video/episode count
if text_content and ('video' in text_content.lower() or 'episode' in text_content.lower()):
conservative_update(metadata, 'video_count', extract_int(text_content))
# View count
elif text_content and 'view' in text_content.lower():
conservative_update(metadata, 'view_count', extract_int(text_content))
# Last updated
elif text_content and 'updated' in text_content.lower():
metadata['time_published'] = extract_date(text_content)
# Extract description from sidebar if available
sidebar = deep_get(response, 'sidebar', 'playlistSidebarRenderer', 'items', default=[])
for sidebar_item in sidebar:
desc = deep_get(sidebar_item, 'playlistSidebarPrimaryInfoRenderer',
'description', 'simpleText')
if desc:
metadata['description'] = desc
if metadata['author_id']:
metadata['author_url'] = 'https://www.youtube.com/channel/' + metadata['author_id']
metadata['first_video_id'] = deep_get(header, 'playEndpoint', 'watchEndpoint', 'videoId')
first_id = re.search(r'([a-z_\-]{11})', deep_get(header,
'thumbnail', 'thumbnails', 0, 'url', default=''))
if first_id:
conservative_update(metadata, 'first_video_id', first_id.group(1))
if metadata['first_video_id'] is None:
metadata['thumbnail'] = None
else:
metadata['thumbnail'] = f"https://i.ytimg.com/vi/{metadata['first_video_id']}/hqdefault.jpg"
metadata['video_count'] = extract_int(header.get('numVideosText'))
metadata['description'] = extract_str(header.get('descriptionText'), default='')
metadata['author'] = extract_str(header.get('ownerText'))
metadata['author_id'] = multi_deep_get(header,
['ownerText', 'runs', 0, 'navigationEndpoint', 'browseEndpoint', 'browseId'],
['ownerEndpoint', 'browseEndpoint', 'browseId'])
if metadata['author_id']:
metadata['author_url'] = 'https://www.youtube.com/channel/' + metadata['author_id']
else:
metadata['author_url'] = None
metadata['view_count'] = extract_int(header.get('viewCountText'))
metadata['like_count'] = extract_int(header.get('likesCountWithoutLikeText'))
for stat in header.get('stats', ()):
text = extract_str(stat)
if 'videos' in text:
conservative_update(metadata, 'video_count', extract_int(text))
elif 'views' in text:
conservative_update(metadata, 'view_count', extract_int(text))
elif 'updated' in text:
metadata['time_published'] = extract_date(text)
microformat = deep_get(response, 'microformat', 'microformatDataRenderer',
default={})
conservative_update(

@@ -628,6 +628,7 @@ def extract_watch_info(polymer_json):
info['manual_caption_languages'] = []
info['_manual_caption_language_names'] = {} # language name written in that language, needed in some cases to create the url
info['translation_languages'] = []
info['_caption_track_urls'] = {} # lang_code -> full baseUrl from player response
captions_info = player_response.get('captions', {})
info['_captions_base_url'] = normalize_url(deep_get(captions_info, 'playerCaptionsRenderer', 'baseUrl'))
# Sometimes the above playerCaptionsRenderer is randomly missing
@@ -658,6 +659,10 @@ def extract_watch_info(polymer_json):
else:
info['manual_caption_languages'].append(lang_code)
base_url = caption_track.get('baseUrl', '')
# Store the full URL from the player response (includes valid tokens)
if base_url:
normalized = base_url if base_url.startswith('http') else normalize_url(base_url)
info['_caption_track_urls'][lang_code + ('_asr' if caption_track.get('kind') == 'asr' else '')] = normalized
lang_name = deep_get(urllib.parse.parse_qs(urllib.parse.urlparse(base_url).query), 'name', 0)
if lang_name:
info['_manual_caption_language_names'][lang_code] = lang_name
@@ -825,6 +830,21 @@ def captions_available(info):
def get_caption_url(info, language, format, automatic=False, translation_language=None):
'''Gets the url for captions with the given language and format. If automatic is True, get the automatic captions for that language. If translation_language is given, translate the captions from `language` to `translation_language`. If automatic is true and translation_language is given, the automatic captions will be translated.'''
# Try to use the direct URL from the player response first (has valid tokens)
track_key = language + ('_asr' if automatic else '')
direct_url = info.get('_caption_track_urls', {}).get(track_key)
if direct_url:
url = direct_url
# Override format
if '&fmt=' in url:
url = re.sub(r'&fmt=[^&]*', '&fmt=' + format, url)
else:
url += '&fmt=' + format
if translation_language:
url += '&tlang=' + translation_language
return url
# Fallback to base_url construction
url = info['_captions_base_url']
if not url:
return None
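
Note on the direct-URL branch of `get_caption_url` above: the `fmt` override uses a regex on `&fmt=` and string concatenation, which assumes `fmt` is never the first query parameter and that the URL already has a query string. A possible alternative, sketched with `urllib.parse` under those same inputs, is shown below. The helper name `set_caption_params` and the sample URL are hypothetical and not part of the commit:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def set_caption_params(url, fmt, translation_language=None):
    """Return `url` with its fmt (and optionally tlang) query parameters replaced.

    Existing fmt/tlang values are dropped so each appears exactly once.
    Note: the query string is decoded and re-encoded, so percent-encoding
    of other parameters may be normalized.
    """
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key not in ('fmt', 'tlang')]
    query.append(('fmt', fmt))
    if translation_language:
        query.append(('tlang', translation_language))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(set_caption_params(
    'https://www.youtube.com/api/timedtext?v=abc&fmt=srv3&lang=en', 'vtt', 'es'))
```

Whether re-encoding the query is acceptable for tokenized caption URLs is untested here, so this is a sketch to evaluate rather than a drop-in replacement.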