James Taylor
1224dd88a3
Fix related video extraction sometimes failing
...
Youtube added some pointless variation in variable names
2020-04-10 13:09:38 -07:00
James Taylor
5554d5afff
Add playlist sidebar for videos in playlist, including autoplay
2020-04-04 22:52:09 -07:00
James Taylor
113c75801a
Fix playlist id extraction for radio renderers
2019-12-31 18:06:31 -08:00
James Taylor
506dbb552a
Extraction: Correctly extract view_count for vids with 0 views.
...
Also change superfluous use of multi_get to item.get nearby
2019-12-30 16:18:38 -08:00
James Taylor
0c6a37e9aa
extract_items: allow extracting items that are normally dug into for more
...
By checking first if it's in item_types rather than checking if it can be dug into first.
For example: this allows extracting things like sectionListRenderer
2019-12-26 19:39:48 -08:00
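The ordering change described in commit 0c6a37e9aa can be illustrated with a minimal sketch. This is not the project's code: the function body, the recursive traversal (the real extract_items is described above as using a while loop), and the sample response shape are all assumptions made for illustration. The only point it shows is that testing membership in item_types before deciding to dig in lets a container such as sectionListRenderer be returned whole when it is requested.

```python
def extract_items(node, item_types):
    """Hypothetical sketch: collect renderers whose type is in item_types.

    The item_types check comes first, so a renderer that would normally be
    dug into is returned as-is when the caller asks for it.
    """
    items = []
    if isinstance(node, list):
        for child in node:
            items.extend(extract_items(child, item_types))
    elif isinstance(node, dict):
        for key, value in node.items():
            if key in item_types:
                # Wanted type: take it whole, even though it has children.
                items.append({key: value})
            else:
                # Not wanted: dig into it and keep looking.
                items.extend(extract_items(value, item_types))
    return items


# Assumed, simplified response shape for demonstration only.
response = {
    'contents': {
        'sectionListRenderer': {
            'contents': [
                {'videoRenderer': {'videoId': 'a'}},
                {'videoRenderer': {'videoId': 'b'}},
            ]
        }
    }
}

videos = extract_items(response, {'videoRenderer'})
sections = extract_items(response, {'sectionListRenderer'})
```

With the sample above, asking for videoRenderer digs through the section list and yields both videos, while asking for sectionListRenderer yields the single enclosing renderer.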
James Taylor
8e8a1b70b6
yt_data_extract: Split up extract_items so renderer extraction works independently
...
extract_items_from_renderer will extract given just a renderer rather than a response
2019-12-26 19:02:13 -08:00
James Taylor
b027f66738
yt_data_extract.common: Simplify usage of get functions and remove dead code
...
Change usage of multi_deep_get to multi_get where possible
Remove checking of type from calls to get functions (because it's very unlikely Youtube suddenly changes the type without changing the name of the variable or anything, and it takes up unnecessary space)
Remove all default=None arguments from get functions, since those are superfluous.
Remove list_types constant since it's no longer in use.
2019-12-26 18:49:04 -08:00
James Taylor
c7edea0848
yt_data_extract: Simplify extract_items so it needs only 1 while loop
2019-12-26 18:38:18 -08:00
James Taylor
f706689a56
extract_item_info: Don't extract author, author_id, etc. for channel items
...
Philosophically, a channel doesn't create itself.
2019-12-24 13:11:21 -08:00
James Taylor
3200d66d88
Fix extract_approx_int not working for non-approx ints, make extract_int more robust
...
For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int.
Make extract_approx_int and extract_int only extract integers that are words.
So e.g. 342 will not be extracted from internetuser342
2019-12-24 13:07:12 -08:00
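The behavior described in commit 3200d66d88 (and the decimal handling from 3936310e7e further down) can be sketched as follows. This is an assumed reimplementation, not the repository's code: the regular expressions, the K/M/B suffix table, and the choice to return None on failure are illustrative guesses. It relies on the word-boundary assertion `\b` so that digits embedded in a larger word are not picked up.

```python
import re

# Hypothetical suffix table; the real project may support other suffixes.
_SUFFIXES = {'': 1, 'K': 10**3, 'M': 10**6, 'B': 10**9}


def extract_int(text):
    """Return the first standalone integer in text, or None."""
    # \b on both sides rejects digits glued to letters (internetuser342).
    match = re.search(r'\b(\d[\d,]*)\b', text)
    if match is None:
        return None
    return int(match.group(1).replace(',', ''))


def extract_approx_int(text):
    """Return an integer from strings like '15.1M' or plain '354', or None."""
    # The optional decimal part keeps '15.1M' from being read as '1M';
    # the optional suffix lets non-approximate integers through as well.
    match = re.search(r'\b(\d+(?:\.\d+)?)([KMB]?)\b', text.replace(',', ''))
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix])
```

Under these assumptions, `extract_approx_int("354 subscribers")` gives 354, `extract_approx_int("15.1M subscribers")` gives 15100000, and both functions return None for `"internetuser342"`.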
James Taylor
7a6bcb6128
Rewrite channel extraction with proper error handling and new extraction names. Extract subscriber_count correctly.
...
Don't just shove english strings into info['stats']. Actually give semantic names for the stats.
2019-12-21 15:45:01 -08:00
James Taylor
3936310e7e
Fix extract_approx_int. Fixes incorrect subscriber count on channels.
...
It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M"
2019-12-21 15:44:03 -08:00
James Taylor
a9f67d4630
Fix regression: date extraction broken. Move constants to correct file in yt_data_extract
2019-12-20 18:48:40 -08:00
James Taylor
4a3529df95
Extraction: Move stuff around in files and put underscores in front of internal helper function names
...
Move get_captions_url in watch_extraction to bottom next to other exported, public functions
2019-12-19 20:12:37 -08:00
James Taylor
d1d908d5b1
Extraction: Move html post processing stuff from yt_data_extract to util
2019-12-19 19:48:53 -08:00
James Taylor
76376b29a0
Extraction: Split yt_data_extract.py into multiple files
2019-12-19 19:29:47 -08:00