57 Commits

Author SHA1 Message Date
97972d6fa3
Fix like count extraction 2024-01-22 06:35:46 +08:00
51a1693789
Fix comment count extraction due to 'K/M' postfixes
YouTube now displays 2K comments instead of 2359, for instance
2024-01-22 05:59:11 +08:00
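The 'K/M' handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `parse_abbreviated_count`, the regex, and support for a 'B' suffix are all assumptions.

```python
import re

# Multipliers for abbreviated suffixes. 'B' is an assumption added for
# completeness; the commit message only mentions 'K' and 'M'.
_SUFFIX_MULTIPLIERS = {'K': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}


def parse_abbreviated_count(text):
    """Return an approximate integer for strings like '2K' or '2,359'.

    Returns None when no number can be recognised.
    """
    if text is None:
        return None
    # A number with optional thousands separators and decimals, followed
    # directly by an optional one-letter suffix that ends the word.
    match = re.search(r'(\d[\d,]*(?:\.\d+)?)([KMB]\b)?', text, re.IGNORECASE)
    if match is None:
        return None
    number = float(match.group(1).replace(',', ''))
    suffix = (match.group(2) or '').upper()
    return int(round(number * _SUFFIX_MULTIPLIERS.get(suffix, 1)))
```

An abbreviated count is only approximate: '2K' yields 2000 even when the real count is 2359.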
Jesus
83af4ab0d7
Fix comment count not extracted sometimes
YouTube created a new key 'commentCount' in addition to 'headerText'
2023-09-11 04:15:25 +08:00
Jesus
cb4ceefada
Filter out translated audio tracks
See comment in code
2023-09-11 04:06:11 +08:00
Jesus E
21224c8dae
watch_extraction.py: fix conditional 2023-06-17 16:25:34 -04:00
Jesus E
74907a8183
Music list extraction: read from SONG field
This one is used when there is no corresponding YouTube video
for the track
2023-05-28 21:45:20 -04:00
Jesus E
aa57ace742
Fix music list extraction
Closes #160
2023-05-28 21:42:13 -04:00
Jesus E
e54596f3e9
Partially fix age restricted videos
Does not work for videos that require decryption because
decryption is not working (giving 403) for some reason.

Related invidious issue for decryption not working:
https://github.com/iv-org/invidious/issues/3245

Partial fix for #146
2023-05-28 21:30:51 -04:00
Jesus E
7b60751e99
Fix failure to detect vp9.2 and mp4v.20.3 codecs 2023-05-28 20:47:47 -04:00
Jesus E
9890617098
Fix fmt extraction mime_type regex failure as well as exceptions 2023-05-28 20:44:30 -04:00
Jesus E
0f78f07875
Remove leftover print statement 2023-05-28 20:40:25 -04:00
Jesus E
08545a29df
Fix likes count 2023-05-28 20:39:11 -04:00
1fbc0cdd46
Fix preview_thumbnails
use 'deep_get' for storyboard
2022-05-30 22:45:08 +08:00
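For context, a 'deep_get'-style helper walks a nested structure and returns a default instead of raising when any level is missing. The version below is a simplified stand-in written for illustration; its signature is an assumption and may differ from the project's own helper, and the keys in the usage lines are hypothetical.

```python
def deep_get(obj, *keys, default=None):
    """Follow keys/indices into nested dicts and lists.

    Returns `default` as soon as any step is missing or not indexable.
    """
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# Hypothetical response shapes, for illustration only.
with_storyboard = {'storyboards': {'renderer': {'spec': 'https://example.com/sb'}}}
without_storyboard = {}

spec = deep_get(with_storyboard, 'storyboards', 'renderer', 'spec')
missing = deep_get(without_storyboard, 'storyboards', 'renderer', 'spec')
```

Here `missing` is simply None, so a video without a storyboard no longer raises a KeyError.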
James Taylor
79fd2966cd
Extract captions base_url using different method when missing
The base url is sometimes missing, seemingly at random.

Take one of the listed captions urls which already
has the &lang and automatic specifiers. Then remove these
specifiers.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2022-03-30 00:41:30 +08:00
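Removing the language and automatic-caption specifiers from a listed captions url might look like the sketch below. The function name and the query parameter names 'lang' and 'kind' are assumptions made for illustration; the commit message does not name the parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_caption_specifiers(url, drop=('lang', 'kind')):
    """Return `url` without the query parameters listed in `drop`.

    The default names in `drop` are assumed, not taken from the commit.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

The result can then serve as a base url to which the wanted language and format are appended.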
James Taylor
dcd4b0f0ae
Fix exception when _captions_base_url is not present
Signed-off-by: Jesús <heckyel@hyperbola.info>
2022-03-30 00:37:43 +08:00
zrose584
a5ef801c07
handle missing storyboard
Signed-off-by: Jesús <heckyel@hyperbola.info>
2022-01-17 09:01:09 -05:00
zrose584
63c92e0c4e
add preview thumbnails
Signed-off-by: Jesús <heckyel@hyperbola.info>
2022-01-09 16:39:50 -05:00
a1d3cc5045
update formats 2021-12-27 13:05:54 -05:00
92067638b1
Disable dislikes
Ref: https://blog.youtube/news-and-events/update-to-youtube/
2021-12-26 13:29:55 -05:00
James Taylor
9c7e93ecf8
Redo av codec settings & selections to accommodate webm
Allows for ranked preferences for h264, av1, and vp9 codecs in
settings, along with equal preferences which are tiebroken using
smaller file size.

For each quality, gives av-merge a list of video sources
and audio sources sorted based on preference & file size. It
will pick the first one that the browser supports.

Closes #84

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-09-06 16:18:11 -05:00
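The ordering described in this commit (ranked codec preferences, ties broken by smaller file size, first supported source wins) can be illustrated with a short sketch. The function names and the 'codec' and 'file_size' dictionary keys are hypothetical, and in the project the final pick happens in av-merge in the browser, not in Python.

```python
def rank_sources(sources, codec_ranks):
    """Order sources by codec preference, then by smaller file size.

    A higher rank means a stronger preference. Codecs with equal ranks
    are tie-broken by file size. Unknown codecs get rank 0.
    """
    return sorted(
        sources,
        key=lambda s: (-codec_ranks.get(s['codec'], 0), s['file_size']),
    )


def pick_first_supported(ranked_sources, supported_codecs):
    """Return the first source whose codec is supported, else None."""
    return next(
        (s for s in ranked_sources if s['codec'] in supported_codecs),
        None,
    )
```

With h264 ranked above av1 and vp9, and the latter two ranked equally, the smaller of the av1 and vp9 files comes right after the h264 source.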
James Taylor
7c79f530a5
Support more audio and video qualities
Adds support for AV1-encoded videos, which includes any videos
above 1080p. These weren't getting included because they did
not have a quality entry in the format table at the top of
watch_extraction.py. So get the quality from the quality
labels of the format if it's not there.

Because YouTube often includes BOTH AV1 and H.264 (AVC) for each
quality, after these are included, there will be way too many
quality options and the code needs to choose which one to use.
The choice is somewhat hard: AV1 is encoded in fewer bytes than
H.264 and is patent-free, however, it has less hardware support,
so might be more difficult to play. For instance, on my system,
AV1 does not work on 1080p, but H.264 does. Adds a setting about
which to prefer, set to H.264 as the default.

Also adds support for the lower quality mp4 audio quality, which
now gets used at 144p to save network bandwidth. For similar
reasons, this was not getting included because it did not
have an audio_bitrate entry in the table. Prefer bitrate
instead for the quality.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-31 16:40:19 -05:00
James Taylor
c9a75042d2
Add support for more qualities, merging video+audio using MSE
Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-29 18:48:56 -05:00
e4af99fd17
Revert "Add support for more qualities, merging video+audio using MSE"
This reverts commit d56df02e7b1eba86baf511289208295b1f6c5a50.
2021-08-29 18:48:01 -05:00
James Taylor
d56df02e7b
Add support for more qualities, merging video+audio using MSE
Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-29 18:44:26 -05:00
James Taylor
2039972ab3
Fix (dis)like, music list extraction due to YouTube changes (again)
YouTube reverted the changes they made that prompted f9f5d5ba.

In case they change their minds again, this adds support for both
formats.

The liberal_update and conservative_update functions needed to be
modified to handle the cases of empty lists, so that
a successfully extracted 'music_list': [{'Author':...},...] will
not be overwritten by 'music_list': [] in the calls to
liberal_dict_update.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-08-09 12:13:52 -05:00
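The empty-list handling can be sketched as below. The two helpers reuse names from the commit message, but their dict-level signatures and bodies are assumptions written for illustration, not the project's implementation.

```python
def _is_empty(value):
    # Treat both None and an empty list as "nothing was extracted".
    return value is None or value == []


def conservative_update(info, new_info):
    """Fill in keys only where `info` has no usable value yet."""
    for key, value in new_info.items():
        if key not in info or (_is_empty(info[key]) and not _is_empty(value)):
            info[key] = value


def liberal_update(info, new_info):
    """Overwrite existing values, but never with an empty value."""
    for key, value in new_info.items():
        if key not in info or not _is_empty(value):
            info[key] = value
```

Under these assumptions, a second extraction pass that returns 'music_list': [] leaves a previously extracted list in place.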
James Taylor
f27105fa7f
New age restriction bypass method since get_video_info was disabled
From
https://github.com/yt-dlp/yt-dlp/issues/574#issuecomment-887171136

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-07-28 23:48:54 -05:00
James Taylor
54b39f1303
Fix missing likes, dislikes, & music list due to Youtube changes
Also moves some microformat extraction from
_extract_watch_info_mobile to extract_watch_info where it belongs.
_extract_watch_info_mobile is really only for stuff visible on the
page, and thus specialized for either mobile or desktop.

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-07-28 23:47:41 -05:00
7fd2c3474f
Capitalize app name 2021-06-10 16:41:45 -05:00
James Taylor
31fe1dac55
Fix signature decryption due to new base.js minifier rules
YouTube now includes e.g. {"fe": ...} instead of just {fe: ...}
in the javascript object entries in the object holding the
operation definitions.

Fixes #2

Signed-off-by: Jesús <heckyel@hyperbola.info>
2021-02-23 17:17:06 -05:00
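A pattern that accepts both quoted and unquoted object keys could look like the sketch below. The regex and the sample JavaScript snippet are made up for illustration and are not taken from the project or from base.js.

```python
import re

# Matches an object entry such as  fe:function  or  "fe":function
# and captures the key without its optional double quotes.
OPERATION_KEY_RE = re.compile(r'"?([A-Za-z_$][\w$]*)"?\s*:\s*function')

# Made-up sample mixing both key styles.
sample_js = 'var Xy={"fe":function(a,b){a.splice(0,b)},Q1:function(a){a.reverse()}};'

operation_names = OPERATION_KEY_RE.findall(sample_js)
```

Making the quotes optional lets the same pattern cover both the old and the new minifier output.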
James Taylor
5edcaa4f9d
Improve ytInitialPlayerResponse extraction
Makes it work if there are additional JavaScript statements
after the playerResponse variable

Signed-off-by: Jesús <heckyel@hyperbola.info>
2020-12-17 11:00:04 -05:00
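One way to tolerate trailing statements after the variable is to let the JSON decoder stop at the end of the first complete object. The sketch below uses the standard library's `json.JSONDecoder.raw_decode`; it is one possible approach, not necessarily the one this commit takes, and the function name is hypothetical.

```python
import json
import re


def extract_player_response(html):
    """Return the object assigned to ytInitialPlayerResponse, or None."""
    match = re.search(r'ytInitialPlayerResponse\s*=\s*', html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index
        # and ignores whatever follows it.
        obj, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return obj
```

Because decoding stops at the closing brace of the object, semicolons inside string values or further statements after it do not affect the result.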
James Taylor
9d0be82e74 Always extract from html watch page to get base.js url
Youtube removed the url from the pbj responses. They are now
only in the html page. Replaces previous fix for the missing
base.js issue.
2020-12-12 23:11:54 -08:00
James Taylor
6443cedf62 Retrieve base.js url from html watch page when it's missing
Fixes failure mode 3 in #22
2020-12-09 17:08:12 -08:00
James Taylor
0589cfb8f7 yt_data_ext: watch playlist: Fix missing author_url if no author_id
Embedded playlist info was missing author_url key if author_id was
None. This caused KeyError in watch.py when it expected that key

Closes #37
2020-11-08 10:08:20 -08:00
James Taylor
f8b6db1480 Redo fix for failure mode 1 in issue #22
Previous fix didn't work. Should work now. The non-embedded player
response can still be present but the urls will be missing.
2020-10-21 22:42:07 -07:00
zrose584
a27b575380 remove trailing whitespaces 2020-10-21 10:35:01 +02:00
James Taylor
c9d0f685a4 Use get_video_info to get video urls if player response missing
Fixes failure mode 1 in #22
2020-10-19 13:53:57 -07:00
James Taylor
20152a6316 Specify video height in html so page doesn't shift down after load
Use true video height extracted from youtube to handle videos
shorter than their quality size. (e.g. widescreen videos)
2020-09-24 18:50:54 -07:00
James Taylor
803c901445 Fix hls_manifest_url not included when there's no other formats
Since there are no formats, it was retrying with the
non-embedded playerResponse, which resulted in the
hls_manifest_urls from the embedded player_response being
overwritten with None. So use conservative_update instead
2020-06-28 18:18:04 -07:00
James Taylor
aa3e5aa441 Add dialog for copying urls to external player for livestreams
Also applies to livestreams that have ended but whose other sources
aren't present or aren't ready yet.
2020-06-28 17:52:24 -07:00
James Taylor
6e14a8547d Handle case where embedded player response missing
Change so it extracts other stuff from regular playerResponse
Extract formats from embedded player response, but fallback to
regular one if that doesn't work.
Sometimes there is no 'player' at top_level and the urls are in
the regular playerResponse
2020-06-28 13:18:54 -07:00
James Taylor
0b5d6fe1ed Do not override previous playability error if unknown 2020-06-28 12:46:04 -07:00
James Taylor
b4450ec4bb Fix previously live videos labeled as live 2020-05-29 15:34:33 -07:00
James Taylor
bdac6a2302 Fix broken signature decryption
The base.js url format changed, so the identifier at the end
was no longer unique. So it was using the wrong cached decryption
function

Changes the identifier to just be the whole url so
this won't happen again.
2020-05-27 12:15:41 -07:00
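Keying the cache on the whole url can be illustrated as follows. The cache dictionary, the function name, and the `build` callback are hypothetical names for this sketch; the project's actual cache layout is not shown in the commit message.

```python
# Maps the full base.js url to its decryption function.
_decryption_cache = {}


def get_decryption_function(base_js_url, build):
    """Return the cached function for this url, building it if needed.

    `build` is a hypothetical callback that produces the decryption
    function for a given base.js url.
    """
    if base_js_url not in _decryption_cache:
        _decryption_cache[base_js_url] = build(base_js_url)
    return _decryption_cache[base_js_url]
```

Two urls that end in the same path segment but differ elsewhere get separate cache entries, so a stale function is not reused for a different player version.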
James Taylor
85db7e46ed Fix urls sometimes not extracted due to youtube changes
The 'cipher' parameter which contains the url is sometimes called
'signatureCipher' instead now.
2020-05-27 11:56:30 -07:00
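Accepting either parameter name can be sketched as below. The function name and the sample field values are illustrative assumptions; only the 'cipher' and 'signatureCipher' names come from the commit message.

```python
from urllib.parse import parse_qs


def extract_cipher_fields(fmt):
    """Return the decoded cipher fields of a format dict, or None.

    Looks under 'cipher' first, then under 'signatureCipher'.
    """
    cipher = fmt.get('cipher') or fmt.get('signatureCipher')
    if cipher is None:
        return None
    # The value is itself a query string; keep the first value per key.
    return {name: values[0] for name, values in parse_qs(cipher).items()}
```

A format that carries neither key yields None, which the caller can treat as "no encrypted url present".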
James Taylor
f1f77c4d77 Fix error getting exit node ip if format urls are None 2020-05-27 11:14:52 -07:00
James Taylor
b2f482f1fb Fix comment count & disabled extraction not working sometimes
because of A/B test.
2020-04-10 13:57:11 -07:00
James Taylor
3e09193eaf Fix exception due to missing 'playlist' key in extracted info
Happens when there's an error on the page and there was no
visible stuff on the page. 'playlist' wasn't set to None in that
case.
2020-04-05 17:27:43 -07:00
James Taylor
4d9d8cec6f Fix error when there's a video format with mimetype class of 'text' 2020-04-04 22:53:49 -07:00
James Taylor
5554d5afff Add playlist sidebar for videos in playlist, including autoplay 2020-04-04 22:52:09 -07:00
James Taylor
8c2b81094e yt_data_extract: fix missing variables in info for unavailable videos
'ip_address' was not set when no formats are available
'allowed_countries' was set to None rather than [] in extract_desktop_info,
which turns out to be the function that gets used in these cases
2020-02-17 20:15:59 -08:00