641 Commits

Author SHA1 Message Date
James Taylor
14b9c30daf Invidious fallback: Use original format info and just substitute invidious urls
Because the invidious formats don't have all the information
2020-02-04 19:08:56 -08:00
James Taylor
9f090dbbf8 Watch page: add info box with allowed countries and tor exit node
Should help with debugging various content blocks
2020-02-01 16:16:49 -08:00
James Taylor
3f310bfc33 Adjust 429 error message. A Tor Browser restart is not required.
The New Identity button suffices to get the socks proxy to use a new exit node.
2020-02-01 15:14:26 -08:00
James Taylor
7c2736aa26 Check for 403 errors and fall back on Invidious
403 errors on the video urls typically happen when a video has copyrighted content or was originally livestreamed. However, they appear not to happen (or at least to happen less frequently) if the Tor exit node uses IPv6.
2020-02-01 15:09:37 -08:00
James Taylor
e364927f83 yt_data_extract: parse mimeType field for codecs
the youtube-dl formats table doesn't have all the necessary information
2020-02-01 14:23:50 -08:00
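
A minimal sketch of what parsing codecs out of a mimeType string could look like. The helper name `parse_mime_type` and the shape of the returned dict are assumptions for illustration, not the project's actual code; only the general `type/subtype; codecs="..."` layout of the field is taken as given.

```python
import re

# Hypothetical helper: the name and return shape are illustrative assumptions.
_MIME_RE = re.compile(
    r'(?P<type>\w+)/(?P<ext>[\w-]+)(?:;\s*codecs="(?P<codecs>[^"]*)")?'
)

def parse_mime_type(mime_type):
    """Split a mimeType string into media type, container and codec list."""
    match = _MIME_RE.match(mime_type)
    if match is None:
        return None
    codecs = match.group('codecs')
    return {
        'media_type': match.group('type'),
        'container': match.group('ext'),
        # The codecs parameter is a comma-separated list; it may be absent.
        'codecs': [c.strip() for c in codecs.split(',')] if codecs else [],
    }

info = parse_mime_type('video/mp4; codecs="avc1.64001F, mp4a.40.2"')
print(info)
```

Returning `None` for unparseable input lets the caller decide how to report the problem instead of raising from inside the parser.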
James Taylor
f787e4e202 Give a proper error message for 429 errors
These occur when too many requests are coming from a Tor exit node.
Before, there would be an error page with an exception instructing users to report the issue.
But this is an expected and persistent issue.
2020-01-31 20:06:15 -08:00
James Taylor
cd4a2fb0eb run.bat: Allow command line usage from any directory
Set the youtube-local directory as the working directory, and use setlocal
so it doesn't affect the shell the command is being run from.
2020-01-30 18:24:23 -08:00
James Taylor
cf507e2cd1 Add full Visual C runtime to fix missing DLL errors on fresh Windows installs
On fresh installs, when no programs have been installed which install
the Visual C runtime as a dependency, the DLLs are not present and brotli fails
to load. Bundle them in releases and make sure brotli sees them by
adding their location to the path (in run.bat).
2020-01-30 18:17:09 -08:00
James Taylor
b2a1f4ecfb Fix signature decryption.
The function body regex was capturing some unrelated new code before the actual function body. Example:

`function(a){a=a.split("");var b=[function(c,d){d=(d%c.length+c.length)%c.length;c.splice(-d).reverse().forEach(function(e){return c.unshift(e)}`

If you look closely, the closing bracket doesn't match the opening one. I have added `{` to the `[^\}]+` part to make sure it only captures matching brackets. Additionally, I've added `return a\.join\(""\)` to the end for good measure.
2020-01-24 14:11:59 -08:00
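
The effect described in the commit above can be reproduced with a small sketch. Both patterns and the sample JavaScript below are simplified assumptions written for illustration; they are not the project's actual regex or real player code. The first pattern stops at the first closing bracket it sees, while the second excludes `{` as well and anchors on `return a.join("")`.

```python
import re

# Made-up player code: an unrelated function that also starts with
# a=a.split(""), followed by the function body we actually want.
js = (
    'var Q=function(a){a=a.split("");var b=[function(c,d){'
    'd=(d%c.length+c.length)%c.length;'
    'c.splice(-d).reverse().forEach(function(e){return c.unshift(e)})}];return b};'
    'var Xy=function(a){a=a.split("");Zq.ab(a,3);Zq.cd(a,1);return a.join("")};'
)

# Assumed "before" pattern: anything up to the first closing bracket.
OLD = r'function\(a\)\{(a=a\.split\(""\);[^\}]+)\}'
# Assumed "after" pattern: no brackets of either kind inside the body,
# and the body must end with return a.join("").
NEW = r'function\(a\)\{(a=a\.split\(""\);[^\{\}]+return a\.join\(""\))\}'

old_body = re.search(OLD, js).group(1)
new_body = re.search(NEW, js).group(1)

print(old_body)  # unbalanced fragment of the unrelated function
print(new_body)  # the intended function body
```

Excluding `{` makes the first candidate fail outright, so the search moves on to the function whose body contains no nested brackets.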
James Taylor
c13a8f677d local playlists: Display error message if no videos are selected or no playlist is chosen when using "add to playlist"
See #4
2020-01-19 14:29:50 -08:00
James Taylor
a677b47c4a Fix display of movie thumbnails in related videos 2020-01-10 09:34:14 -08:00
James Taylor
113c75801a Fix playlist id extraction for radio renderers 2019-12-31 18:06:31 -08:00
James Taylor
506dbb552a Extraction: Correctly extract view_count for vids with 0 views.
Also change superfluous use of multi_get to item.get nearby
2019-12-30 16:18:38 -08:00
James Taylor
0c6a37e9aa extract_items: allow extracting items that are normally dug into for more
By checking first if it's in item_types rather than checking if it can be dug into first.
For example: this allows extracting things like sectionListRenderer
2019-12-26 19:39:48 -08:00
James Taylor
8e8a1b70b6 yt_data_extract: Split up extract_items so renderer extraction works independently
extract_items_from_renderer will extract given just a renderer rather than a response
2019-12-26 19:02:13 -08:00
James Taylor
b027f66738 yt_data_extract.common: Simplify usage of get functions and remove dead code
Change usage of multi_deep_get to multi_get where possible
Remove checking of type from calls to get functions (because it's very unlikely Youtube suddenly changes the type without changing the name of the variable or anything, and it takes up unnecessary space)
Remove all default=None arguments from get functions, since those are superfluous.
Remove list_types constant since it's no longer in use.
2019-12-26 18:49:04 -08:00
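
For readers unfamiliar with this style of helper, here is a rough sketch of what "get functions" of this kind might look like. The signatures and semantics are assumptions for illustration (including the name `deep_get`, which does not appear in the log); the real functions in yt_data_extract may differ. Because `default` already defaults to `None`, passing `default=None` explicitly is redundant, which is the point the commit makes.

```python
def multi_get(mapping, *keys, default=None):
    """Return the value of the first key in `keys` present in `mapping`."""
    for key in keys:
        if key in mapping:
            return mapping[key]
    return default

def deep_get(obj, *keys, default=None):
    """Follow `keys` through nested dicts and lists.

    Returns `default` as soon as any step is missing or not indexable.
    """
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
    return obj

# Made-up response fragment for demonstration.
renderer = {'videoRenderer': {'title': {'runs': [{'text': 'Example'}]}}}

title = deep_get(renderer, 'videoRenderer', 'title', 'runs', 0, 'text')
missing = deep_get(renderer, 'videoRenderer', 'missing', 0)
text = multi_get({'simpleText': 'hi'}, 'runs', 'simpleText')
```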
James Taylor
c7edea0848 yt_data_extract: Simplify extract_items so it needs only 1 while loop 2019-12-26 18:38:18 -08:00
James Taylor
f7a5f7fbaa items: commatize channel video count and playlist video count 2019-12-24 13:18:46 -08:00
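
"Commatizing" a count can be done with Python's format mini-language. This is a minimal sketch; the function name `commatize` is a hypothetical stand-in and may not match the project's helper.

```python
def commatize(count):
    """Format an integer count with thousands separators."""
    # The ',' format option inserts a comma every three digits.
    return '{:,}'.format(int(count))

video_count = commatize(1234567)
```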
James Taylor
f706689a56 extract_item_info: Don't extract author, author_id, etc. for channel items
Philosophically, a channel doesn't create itself.
2019-12-24 13:11:21 -08:00
James Taylor
3200d66d88 Fix extract_approx_int not working for non-approx ints, make extract_int more robust
For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int.
Make extract_approx_int and extract_int only extract integers that are words.
So e.g. 342 will not be extracted from internetuser342
2019-12-24 13:07:12 -08:00
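
The "integers that are words" rule can be sketched with regex word boundaries. The function name `extract_int_word` and the pattern are assumptions for illustration, not the project's implementation.

```python
import re

# \b requires a word boundary on both sides, so digits glued to letters
# (as in "internetuser342") are not treated as a standalone integer.
_INT_WORD = re.compile(r'\b(\d{1,3}(?:,\d{3})+|\d+)\b')

def extract_int_word(text, default=None):
    """Return the first standalone integer in `text`, or `default`."""
    match = _INT_WORD.search(text)
    if match is None:
        return default
    return int(match.group(1).replace(',', ''))

subscribers = extract_int_word('354 subscribers')
username = extract_int_word('internetuser342')
```

Here `subscribers` is 354, while `username` is `None` because there is no word boundary between `r` and `3`.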
James Taylor
a428d47bde Channel searching: indicate if there are no results 2019-12-23 15:09:44 -08:00
James Taylor
9737ffcf82 Regression: Fix channel extraction 'items' key not present when there are no items.
Examples: Empty channels, no search results
2019-12-23 15:07:03 -08:00
James Taylor
777ed756dc Channel: Change search results to use next and previous page buttons
Youtube doesn't give the number of search results, so the previous behavior would give an error if a page number out of range was selected.
2019-12-23 14:39:59 -08:00
James Taylor
c56fc56fa6 Subscriptions: Cleaner error message when checking terminated channels
Don't display a nasty traceback in that case.
2019-12-22 19:00:44 -08:00
James Taylor
250723b797 Subscriptions: Make uploader name clickable, with link to channel 2019-12-22 18:51:21 -08:00
James Taylor
222117143f Finally fix video count on channels accessed through general urls, rather than just channel id.
It was set to a fake value of 1000 previously in order to ensure there would be enough page buttons.
This was because two sequential requests are necessary (one to get the channel id corresponding to the custom url, another to get the number of videos from the "all uploaded videos" playlist, the url for which can be generated from the channel id).

Since Tor has a high latency, I thought at the time that this would be too slow, but in practice it's not too big of a deal.

Introduces cachetools dependency in order to cache the function which gets the number of videos.

The get_channel_id function has also been fixed since the ajax api seems to have been removed.
2019-12-22 18:29:31 -08:00
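
The two-request lookup plus caching described in the commit above can be sketched as follows. This is a stand-in under stated assumptions: the commit uses cachetools, whereas this sketch swaps in the standard library's `functools.lru_cache` (which has no time-based expiry) to stay self-contained. The fetch functions are hypothetical stubs that simulate network requests, and deriving the uploads playlist id by replacing the `UC` prefix with `UU` is an assumption.

```python
from functools import lru_cache

request_log = []  # records each simulated network request

def fetch_channel_id(custom_url):
    """Stub for request 1: resolve a custom url to a channel id."""
    request_log.append(('channel_id', custom_url))
    return 'UC_hypothetical_id'

def fetch_playlist_video_count(playlist_id):
    """Stub for request 2: read the video count of a playlist."""
    request_log.append(('video_count', playlist_id))
    return 128

@lru_cache(maxsize=128)
def get_number_of_videos(custom_url):
    channel_id = fetch_channel_id(custom_url)
    # Assumed convention: uploads playlist id = 'UU' + channel id minus 'UC'.
    uploads_playlist_id = 'UU' + channel_id[2:]
    return fetch_playlist_video_count(uploads_playlist_id)

first = get_number_of_videos('/user/example')
second = get_number_of_videos('/user/example')  # served from the cache
```

Only the first call pays for the two sequential requests; repeat visits to the same channel skip both, which matters most over a high-latency connection.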
James Taylor
bafae2837e channel.py: Refactor channel_id route logic into general channel url logic.
Deduplicates the code. channel_id logic was previously separate because of the need to get the number of videos and different page numbers.
Also makes search work for general urls, not just channel_id urls
2019-12-22 18:26:00 -08:00
James Taylor
7a6bcb6128 Rewrite channel extraction with proper error handling and new extraction names. Extract subscriber_count correctly.
Don't just shove english strings into info['stats']. Actually give semantic names for the stats.
2019-12-21 15:45:01 -08:00
James Taylor
3936310e7e Fix extract_approx_int. Fixes incorrect subscriber count on channels.
It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M"
2019-12-21 15:44:03 -08:00
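
The bug described above is easy to reproduce. Both patterns below are assumptions for illustration (the project's actual regexes are not shown in this log), and `extract_approx` is a hypothetical name. A pattern that only allows digits before the suffix skips past "15." and matches "1M"; allowing an optional decimal part fixes that.

```python
import re

# Assumed naive pattern: digits directly followed by a magnitude suffix.
NAIVE = re.compile(r'\d+[KMB]')
# Assumed fixed pattern: optional decimal part, optional suffix.
FIXED = re.compile(r'\d+(?:\.\d+)?[KMB]?')

def extract_approx(text, pattern=FIXED):
    """Return the first approximate count in `text` as a string, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None

naive_result = extract_approx('15.1M subscribers', NAIVE)
fixed_result = extract_approx('15.1M subscribers')
plain_result = extract_approx('354 subscribers')
```

Making the suffix optional also lets exact counts such as "354 subscribers" pass through unchanged.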
James Taylor
66746d0ca8 Watch: Add padding in description box and urlize links 2019-12-20 21:00:10 -08:00
James Taylor
4b6efb0e0b Watch: display comment count and whether comments are disabled 2019-12-20 20:52:01 -08:00
James Taylor
d2ba9be7a7 Better error handling for incorrect watch page urls
- Correctly handle /embed, /watch with no video ids
- Correctly report error for this and for too short video ids
2019-12-20 20:35:05 -08:00
James Taylor
98fbdf77cb Add custom 500 error page. Display the traceback. Center and format error page in general.
Also add a link to github for reporting the exception.
2019-12-20 20:21:29 -08:00
James Taylor
80de90b1bb Add support for /embed urls 2019-12-20 19:23:15 -08:00
James Taylor
310585ae9e Subscriptions: Display currently selected tag in page title 2019-12-20 18:58:39 -08:00
James Taylor
0bc2e43822 Watch: Add border around badges such as unlisted badge
Especially for the light theme
2019-12-20 18:54:24 -08:00
James Taylor
a9f67d4630 Fix regression: date extraction broken. Move constants to correct file in yt_data_extract 2019-12-20 18:48:40 -08:00
James Taylor
c3321d31d0 Subscriptions: Display selected tag above videos.
Otherwise, it wasn't clear enough that a tag was selected.
2019-12-20 12:44:42 -08:00
James Taylor
b4406df9cf Merge branch 'modular-data-extract'
Commits in this branch are prefixed with "Extraction:"
This branch refactors data extraction. All such functionality has been moved to the yt_data_extract module.
Responses from requests are given to the module and it parses them into a consistent, more useful format.
The dependency on youtube-dl has also been dropped and this functionality has been built from scratch for these reasons:
(1) I've noticed youtube-dl breaks more often than invidious (which uses watch page extraction built from scratch) in response to changes from Youtube, so I'm hoping what I wrote will also be less brittle.
(2) Such breakage is inconvenient because I have to manually merge the fixes since I had to make changes to youtube-dl to make it do things such as extracting related videos.
(3) I have no control over error handling and request pooling with youtube-dl, since it does all the requests (these would require intrusive changes I don't want to maintain).
(4) I will now be able to finally display the number of comments and whether comments are disabled without making additional requests.
2019-12-19 21:33:54 -08:00
James Taylor
6b7a1212e3 Extraction: Move non-stateful signature decryption functionality into yt_data_extract 2019-12-19 21:28:21 -08:00
James Taylor
4a3529df95 Extraction: Move stuff around in files and put underscores in front of internal helper function names
Move get_captions_url in watch_extraction to bottom next to other exported, public functions
2019-12-19 20:12:37 -08:00
James Taylor
d1d908d5b1 Extraction: Move html post processing stuff from yt_data_extract to util 2019-12-19 19:48:53 -08:00
James Taylor
76376b29a0 Extraction: Split yt_data_extract.py into multiple files 2019-12-19 19:29:47 -08:00
James Taylor
beb0976b5b Extraction: Rewrite comment extraction, remove author_id and rename author_channel_id to that, fix bug in extract_items
author_id (an internal sql-like integer previously required for deleting and editing comments) has been removed by Youtube and is no longer required.
Remove it for simplicity.
Rename author_channel_id to author_id for consistency with other extraction attributes.
extract_items returned None for items instead of [] for empty continuation responses. Fixes that.
2019-12-19 15:50:19 -08:00
James Taylor
02848a1a32 Extraction: Adjust related videos box to fit new time_published information well
time_published will be put to the right of the view_count in related videos
The author will now always be above the other stats. This makes no difference in the big search result boxes, because the description snippet there is always very short
(However, it's important the author isn't inline with the other stats in related video boxes since those are so narrow and the author name can be very long)
2019-12-19 15:46:16 -08:00
James Taylor
004e14a538 Extraction: Use accessibility data to get timestamp and to get views for recommended videos 2019-12-18 20:53:11 -08:00
James Taylor
f6bf5213a5 Extraction: rename multi_get functions to more descriptive names 2019-12-18 19:43:55 -08:00
James Taylor
98777ee825 Extraction: Rewrite item_extraction for better error handling and readability, rename extracted names for more consistency 2019-12-18 19:39:16 -08:00
James Taylor
ee0a118a6c Extraction: Fix thumbnail and remove badges on related videos 2019-12-17 21:52:31 -08:00
James Taylor
e98a1965d2 Extraction: Fix mistake with age-restriction detection 2019-12-17 21:06:06 -08:00