47 Commits

Author SHA1 Message Date
a374f90f6e fix: add support for YouTube Shorts tab on channel pages
- Rewrite channel_ctoken_v5 with correct protobuf field numbers per tab
  (videos=15, shorts=10, streams=14) based on Invidious source
- Replace broken pbj=1 endpoint with youtubei browse API for shorts/streams
- Add shortsLockupViewModel parser to extract video data from new YT format
- Fix channel metadata not loading (get_metadata now uses browse API)
- Fix metadata caching: skip caching when channel_name is absent
- Show actual item count instead of UU playlist count for shorts/streams
- Format view counts with spaced suffixes (7.1 K, 1.2 M, 3 B)
2026-04-01 11:43:46 -05:00
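The spaced-suffix view count format mentioned above (7.1 K, 1.2 M, 3 B) could look roughly like the following. This is a minimal sketch, not the project's code: the function name `format_count` and the rounding to one decimal are assumptions made for illustration.

```python
def format_count(n):
    """Format an integer count with a spaced K/M/B suffix (hypothetical helper)."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if n >= threshold:
            value = f"{n / threshold:.1f}"
            # Drop a trailing ".0" so 3_000_000_000 renders as "3 B", not "3.0 B".
            if value.endswith(".0"):
                value = value[:-2]
            return f"{value} {suffix}"
    # Counts below 1000 are shown unchanged.
    return str(n)
```

For example, `format_count(7100)` yields `7.1 K` and `format_count(3_000_000_000)` yields `3 B`.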
06051dd127 fix: support YouTube 2024+ data formats for playlists, podcasts and channels
- Add PODCAST content type support in lockupViewModel extraction
- Extract thumbnails and episode count from thumbnail overlay badges
- Migrate playlist page fetching from pbj=1 to innertube API (youtubei/v1/browse)
- Support new pageHeaderRenderer format in playlist metadata extraction
- Fix subscriber count extraction when YouTube returns handle instead of count
- Hide "None subscribers" in template when data is unavailable
2026-03-31 21:38:51 -05:00
56ecd6cb1b fix: use YouTube-provided thumbnail URLs instead of hardcoded hq720.jpg
Videos without hq720.jpg thumbnails caused mass 404 errors.
Now preserves the actual thumbnail URL from YouTube's API response
and falls back to hqdefault.jpg only when no thumbnail is provided.
Also picks highest quality thumbnail from API (thumbnails[-1])
and adds progressive fallback for subscription/download functions.
2026-03-27 19:22:12 -05:00
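The selection rule described in this commit (take `thumbnails[-1]`, fall back to hqdefault.jpg only when nothing is provided) can be sketched as below. The function name `pick_thumbnail`, the `{"url": ...}` entry shape, and the `i.ytimg.com` URL pattern are assumptions for illustration, not taken from the commit.

```python
def pick_thumbnail(video_id, thumbnails):
    """Return the last (assumed highest quality) thumbnail URL, else a fallback."""
    # Assumption: the API lists thumbnails from lowest to highest quality.
    urls = [t["url"] for t in (thumbnails or []) if t.get("url")]
    if urls:
        return urls[-1]
    # Fallback only when the response carried no usable thumbnail.
    return f"https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"
```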
6a68f06645 Release v0.4.0 - HD Thumbnails, YouTube 2024+ Support, and yt-dlp Integration
Major Features:
- HD video thumbnails (hq720.jpg) with automatic fallback to lower qualities
- HD channel avatars (240x240 instead of 88x88)
- YouTube 2024+ lockupViewModel support for channel playlists
- youtubei/v1/browse API integration for channel playlist tabs
- yt-dlp integration for multi-language audio and subtitles

Bug Fixes:
- Fixed undefined `abort` import in playlist.py
- Fixed undefined functions in proto.py (encode_varint, bytes_to_hex, succinct_encode)
- Fixed missing `traceback` import in proto_debug.py
- Fixed blurry playlist thumbnails using default.jpg instead of HD versions
- Fixed channel playlists page using deprecated pbj=1 format

Improvements:
- Automatic thumbnail fallback system (hq720 → sddefault → hqdefault → mqdefault → default)
- JavaScript thumbnail_fallback() handler for 404 errors
- Better thumbnail quality across all pages (watch, channel, playlist, subscriptions)
- Consistent HD avatar display for all channel items
- Settings system automatically adds new settings without breaking user config

Files Modified:
- youtube/watch.py - HD thumbnails for related videos and playlist items
- youtube/channel.py - HD thumbnails for channel playlists, youtubei API integration
- youtube/playlist.py - HD thumbnails, fixed abort import
- youtube/util.py - HD thumbnail URLs, avatar HD upgrade, prefix_url improvements
- youtube/comments.py - HD video thumbnail
- youtube/subscriptions.py - HD thumbnails, fixed abort import
- youtube/yt_data_extract/common.py - lockupViewModel support, extract_lockup_view_model_info()
- youtube/yt_data_extract/everything_else.py - HD playlist thumbnails
- youtube/proto.py - Fixed undefined function references
- youtube/proto_debug.py - Added traceback import
- youtube/static/js/common.js - thumbnail_fallback() handler
- youtube/templates/*.html - Added onerror handlers for thumbnail fallback
- youtube/version.py - Bump to v0.4.0

Technical Details:
- All thumbnail URLs now use hq720.jpg (1280x720) when available
- Fallback handled client-side via JavaScript onerror handler
- Server-side avatar upgrade via regex in util.prefix_url()
- lockupViewModel parser extracts contentType, metadata, and first_video_id
- Channel playlist tabs now use youtubei/v1/browse instead of deprecated pbj=1
- Settings version system ensures backward compatibility
2026-03-22 20:50:03 -05:00
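The release notes say the fallback itself runs client-side in a JavaScript onerror handler. The Python sketch below only illustrates the ordering of the chain (hq720 → sddefault → hqdefault → mqdefault → default); `FALLBACK_ORDER` and `next_thumbnail` are hypothetical names, and the real handler may work differently.

```python
# Quality names in the order listed in the release notes, best first.
FALLBACK_ORDER = ["hq720", "sddefault", "hqdefault", "mqdefault", "default"]

def next_thumbnail(url):
    """Return the URL for the next-lower quality, or None when none is left."""
    for i, name in enumerate(FALLBACK_ORDER[:-1]):
        marker = f"/{name}.jpg"
        if marker in url:
            return url.replace(marker, f"/{FALLBACK_ORDER[i + 1]}.jpg")
    # Either already at default.jpg or not a recognized thumbnail URL.
    return None
```

Calling `next_thumbnail` repeatedly on a failing URL walks the chain one step per 404 until it returns `None`.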
5f3b90ad45 Fix channel about tab 2024-01-22 06:29:42 +08:00
Jesus
5594d017e2 Fix related vids, like_count, playlist sometimes missing
Cause is that some pages have the onResponseReceivedEndpoints key
at the top level with useless stuff in it, and the extract_items
function was searching in that instead of the 'contents' key.

Change to use if blocks instead of elif blocks in the
extract_items function.
2023-09-11 04:13:56 +08:00
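The difference between `elif` and `if` described in this commit can be illustrated with a simplified sketch. The flat response layout, the function name `collect_items`, and the `videoRenderer` item type are assumptions; the real extract_items function digs through nested renderers.

```python
def collect_items(response, wanted="videoRenderer"):
    """Gather wanted items from every known top-level key (simplified sketch)."""
    items = []
    # With `elif`, a junk-only 'onResponseReceivedEndpoints' key would stop
    # the search before 'contents' was examined. Independent `if` blocks
    # check both keys.
    if "onResponseReceivedEndpoints" in response:
        items += [e for e in response["onResponseReceivedEndpoints"] if wanted in e]
    if "contents" in response:
        items += [e for e in response["contents"] if wanted in e]
    return items
```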
Jesus E
0f4bf45cde Fix minor formatting issues 2023-06-17 16:14:59 -04:00
Jesus E
d7f934b7b2 Merge short and video parsing even further
Use multi_get and multi_deep_get for tag differences
Replace the duration check with conservative_update
2023-06-17 16:14:02 -04:00
Jesus E
a4299dc917 Merge short and video parsing 2023-06-17 16:10:59 -04:00
Jesus E
e6fd9b40f4 Fix parsing shorts
Add check for extracting duration for shorts
Make short duration extraction stricter
Fix handling shorts with no views
2023-06-17 16:08:52 -04:00
Jesus E
f322035d4a Add functional but preliminary channel tab support
Add channel tabs to the channel template and script
Update continuation token to request different tabs

Add support for 'reelItemRenderer' format required to extract shorts
2023-06-17 16:05:40 -04:00
Jesus E
aa57ace742 Fix music list extraction
Closes #160
2023-05-28 21:42:13 -04:00
Jesus E
68752000f0 Update channel to new ctoken format
Huge thanks to @michaelweiser

Different sortings still don't work for videos and playlists
2023-05-28 21:04:36 -04:00
Jesús
f3469b1ff4 Revert "Usage hqdefault thumbnail in related videos"
This reverts commit a0c3ca0159.
2021-09-14 16:35:04 -05:00
Jesús
a0c3ca0159 Usage hqdefault thumbnail in related videos 2021-09-14 15:58:13 -05:00
James Taylor
7c79f530a5 Support more audio and video qualities
Adds support for AV1-encoded videos, which includes any videos
above 1080p. These weren't getting included because they did
not have a quality entry in the format table at the top of
watch_extraction.py. So get the quality from the quality
labels of the format if it's not there.

Because YouTube often includes BOTH AV1 and H.264 (AVC) for each
quality, after these are included, there will be way too many
quality options and the code needs to choose which one to use.
The choice is somewhat hard: AV1 is encoded in fewer bytes than
H.264 and is patent-free, however, it has less hardware support,
so might be more difficult to play. For instance, on my system,
AV1 does not work on 1080p, but H.264 does. Adds a setting about
which to prefer, set to H.264 as the default.

Also adds support for the lower quality mp4 audio quality, which
now gets used at 144p to save network bandwidth. For similar
reasons, this was not getting included because it did not
have an audio_bitrate entry in the table. Prefer bitrate
instead for the quality.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-31 16:40:19 -05:00
James Taylor
4e556efa3d Fix comments extraction due to new response continuation key name
Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-23 18:40:52 -05:00
James Taylor
40fcee52c0 Fix description extraction in search results
Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-09 12:29:01 -05:00
James Taylor
2039972ab3 Fix (dis)like, music list extraction due to YouTube changes (again)
YouTube reverted the changes they made that prompted f9f5d5ba.

In case they change their minds again, this adds support for both
formats.

The liberal_update and conservative_update functions needed to be
modified to handle the cases of empty lists, so that
a successfully extracted 'music_list': [{'Author':...},...] will
not be overwritten by 'music_list': [] in the calls to
liberal_dict_update.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-09 12:13:52 -05:00
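The empty-list handling described above might look like the sketch below. The signatures and exact conditions are assumptions inferred from the commit message, not copied from the repository.

```python
def liberal_update(obj, key, new_value):
    """Overwrite obj[key] unless new_value is None or an empty list.

    A missing key is still created, so obj[key] always exists afterwards.
    """
    if (new_value is not None and new_value != []) or key not in obj:
        obj[key] = new_value

def conservative_update(obj, key, new_value):
    """Set obj[key] only when it is missing, None, or an empty list."""
    if obj.get(key) in (None, []):
        obj[key] = new_value
```

With these rules, an already extracted `'music_list': [{'Author': ...}]` survives a later update that carries `'music_list': []`.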
James Taylor
3dee7ea0d1 Switch to new comments api now that old one is being disabled
watch_comment api periodically gives the error "Top level
comments mweb servlet is turned down."

The continuation items for the new api are in a different
arrangement in the json, so changes were necessary to the
extract_items function.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-09 12:10:42 -05:00
James Taylor
54b39f1303 Fix missing likes, dislikes, & music list due to Youtube changes
Also moves some microformat extraction from
_extract_watch_info_mobile to extract_watch_info where it belongs.
_extract_watch_info_mobile is really only for stuff visible on the
page, and thus specialized for either mobile or desktop.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-07-28 23:47:41 -05:00
Jesús
7fd2c3474f Capitalize name app 2021-06-10 16:41:45 -05:00
James Taylor
f0cd170767 Fix videos added to playlist from channel page not having author
Information from additional_info was being overridden with None.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-05-17 22:02:03 -05:00
James Taylor
e549b5f67c Channel: Allow going to next pages of playlists page
Uses previous and next buttons. Now can view more than just
first page of playlists page

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-03-15 22:22:15 -05:00
James Taylor
2df4238924 Use new channel api endpoint now that browse_ajax is disabled
Fixes channel pages > 1

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-03-03 10:40:02 -05:00
James Taylor
1cc0ffcb20 yt_data_ext: support richGrid&richItem sometimes used on search
Some searches have these renderers instead of the usual ones

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-02-13 17:29:05 -05:00
James Taylor
6b6a6653a0 Fix youtube mixes
They cannot be viewed on their own, so change url in items to
go to the video+playlist instead

Signed-off-by: Jesús <heckyel@hyperbola.info>
2020-12-18 23:39:25 -05:00
zrose584
a27b575380 remove trailing whitespaces 2020-10-21 10:35:01 +02:00
James Taylor
75e8930958 yt_data_extract: normalize thumbnail and author urls
for instance, urls that start with // become https://

adjustment required in comments.py because the url was left as a
relative url in yt_data_extract by mistake and was using URL_ORIGIN
prefix as fix.

see #31
2020-10-19 12:55:03 -07:00
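The normalization mentioned in this commit (URLs that start with // become https://) can be sketched as follows. The name `normalize_url` and the handling of `None` are assumptions; the real function may cover more cases.

```python
def normalize_url(url):
    """Turn a protocol-relative URL into an https URL (minimal sketch)."""
    if url is None:
        return None
    if url.startswith("//"):
        # "//host/path" inherits the page scheme in a browser; pin it to https.
        return "https:" + url
    return url
```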
James Taylor
4bedf55461 yt_data_extract: Fix time_published picking up 'Streaming' string
This was causing an exception in subscriptions when it tried
to estimate the unix timestamp for the upload time
2020-08-12 14:40:47 -07:00
James Taylor
fa61874f97 extract_items: Handle case where continuation has multiple
[something]Continuation renderers, all of which are junk
except one. Check the items in each one until the one which
contains the items being sought is found.
The usage in extract_comments_info needed to be changed to
specify the items being sought. It was unspecified before which
is strictly incorrect since extract_items by default looks for
video/playlist/channel thumbnail items. It was relying on this
special case for continuations. But now that wouldn't work
anymore.
2020-08-11 19:59:25 -07:00
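The search strategy described here (check each continuation renderer until one holds the items being sought) could be sketched like this. The function name `first_with_items`, the `contents` key, and the renderer names in the usage note are assumptions for illustration only.

```python
def first_with_items(renderers, item_types):
    """Return items from the first renderer containing any sought item type."""
    for renderer in renderers:
        items = [
            entry for entry in renderer.get("contents", [])
            # An entry qualifies when one of its keys is a sought item type.
            if item_types.intersection(entry)
        ]
        if items:
            return items
    # Every renderer was junk for the requested item types.
    return []
```

The caller passes the sought types explicitly, for example `first_with_items(renderers, {"commentRenderer"})`, which mirrors the change to extract_comments_info noted above.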
James Taylor
1224dd88a3 Fix related video extraction sometimes failing
YouTube added some pointless variation in variable names
2020-04-10 13:09:38 -07:00
James Taylor
5554d5afff Add playlist sidebar for videos in playlist, including autoplay 2020-04-04 22:52:09 -07:00
James Taylor
113c75801a Fix playlist id extraction for radio renderers 2019-12-31 18:06:31 -08:00
James Taylor
506dbb552a Extraction: Correctly extract view_count for vids with 0 views.
Also change superfluous use of multi_get to item.get nearby
2019-12-30 16:18:38 -08:00
James Taylor
0c6a37e9aa extract_items: allow extracting items that are normally dug into for more
By checking first if it's in item_types rather than checking if it can be dug into first.
For example: this allows extracting things like sectionListRenderer
2019-12-26 19:39:48 -08:00
James Taylor
8e8a1b70b6 yt_data_extract: Split up extract_items so renderer extraction works independently
extract_items_from_renderer will extract given just a renderer rather than a response
2019-12-26 19:02:13 -08:00
James Taylor
b027f66738 yt_data_extract.common: Simplify usage of get functions and remove dead code
Change usage of multi_deep_get to multi_get where possible
Remove checking of type from calls to get functions (because it's very unlikely YouTube suddenly changes the type without changing the name of the variable or anything, and it takes up unnecessary space)
Remove all default=None arguments from get functions, since those are superfluous.
Remove list_types constant since it's no longer in use.
2019-12-26 18:49:04 -08:00
James Taylor
c7edea0848 yt_data_extract: Simplify extract_items so it needs only 1 while loop 2019-12-26 18:38:18 -08:00
James Taylor
f706689a56 extract_item_info: Don't extract author, author_id, etc. for channel items
Philosophically, a channel doesn't create itself.
2019-12-24 13:11:21 -08:00
James Taylor
3200d66d88 Fix extract_approx_int not working for non-approx ints, make extract_int more robust
For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int.
Make extract_approx_int and extract_int only extract integers that are words.
So e.g. 342 will not be extracted from internetuser342
2019-12-24 13:07:12 -08:00
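The "integers that are words" rule can be expressed with regex word boundaries. The sketch below is an assumption about how such helpers might be written, not the repository's implementation; the K/M/B suffix table and the comma handling are illustrative choices.

```python
import re

_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def extract_int(text):
    """Return the first standalone integer in text, or None."""
    if text is None:
        return None
    # \b on both sides rejects digits glued to letters, e.g. "internetuser342".
    match = re.search(r"\b\d[\d,]*\b", text)
    return int(match.group(0).replace(",", "")) if match else None

def extract_approx_int(text):
    """Return an integer from strings like '15.1M' or '354', or None."""
    if text is None:
        return None
    match = re.search(r"\b(\d[\d,]*(?:\.\d+)?)\s*([KMB])?\b", text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    # A missing suffix means the number is exact, so multiply by 1.
    return int(round(number * _SUFFIXES.get(match.group(2), 1)))
```

Matching the optional decimal part before the suffix is what keeps "15.1M" from being read as "1M".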
James Taylor
7a6bcb6128 Rewrite channel extraction with proper error handling and new extraction names. Extract subscriber_count correctly.
Don't just shove english strings into info['stats']. Actually give semantic names for the stats.
2019-12-21 15:45:01 -08:00
James Taylor
3936310e7e Fix extract_approx_int. Fixes incorrect subscriber count on channels.
It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M"
2019-12-21 15:44:03 -08:00
James Taylor
a9f67d4630 Fix regression: date extraction broken. Move constants to correct file in yt_data_extract 2019-12-20 18:48:40 -08:00
James Taylor
4a3529df95 Extraction: Move stuff around in files and put underscores in front of internal helper function names
Move get_captions_url in watch_extraction to bottom next to other exported, public functions
2019-12-19 20:12:37 -08:00
James Taylor
d1d908d5b1 Extraction: Move html post processing stuff from yt_data_extract to util 2019-12-19 19:48:53 -08:00
James Taylor
76376b29a0 Extraction: Split yt_data_extract.py into multiple files 2019-12-19 19:29:47 -08:00