Sometimes YouTube redirects to a google.com/sorry page, seemingly
setting up redirect loops. Other times the url redirects
to itself.
Signed-off-by: Jesús <heckyel@hyperbola.info>
watch_comment api periodically gives the error "Top level
comments mweb servlet is turned down."
The continuation items for the new api are in a different
arrangement in the json, so changes were necessary to the
extract_items function.
Signed-off-by: Jesús <heckyel@hyperbola.info>
New 429 captcha page doesn't have IP. This new page appears to
match the 429 code plus the json of {"redirect": ...} which would
be occasionally received when the pbj json endpoint was used in
the past.
Closes#22
Signed-off-by: Jesús <heckyel@hyperbola.info>
Info parsing is handled by yt_data_extract, and html
post-processing is done with util.prefix_urls and
util.add_extra_html_info
Signed-off-by: Jesús <heckyel@hyperbola.info>
This function was only necessary with the old ajax format, which
was removed in 4d7bba92eb62518e2273d030235214f4a7605444
Signed-off-by: Jesús <heckyel@hyperbola.info>
The request can be retried immediately after the first
new identity, but if we do more new identities, we have to wait
for at least 6 seconds before doing the request, otherwise
it won't be done on a new ip based on my experiments.
Potential issue: If after getting third new identity, request
takes > 12 seconds (since timeout is 15) and returns 429, then the
Tor Manager will let it do a 4th try instead of giving up (meaning
request is taking forever from user's perspective).
Should be a very rare occurence however.
Signed-off-by: Jesús <heckyel@hyperbola.info>
Includes non-tor video routing by default, so no more chances
of the browser leaking headers or user agent to googlevideo
Adjust settings upgrade system to facilitate change to route_tor
setting.
Add some more space on settings page for dropdown settings so does
not overflow due to options with long names.
Closes#7
Specifically, fix failures when any of the fields from the parsed
comment are None, such as author, author_url, etc.
(failure due to string concatenation when building urls).
This is likely not a big deal since it is already assumed that video file server logs are not plugged into
Google's tracking infrastructure, but it doesn't hurt to give less info.
The default urllib3 max redirect amount was set to 3. Change it to 10 and
do not fail if there is a problem with checking for URL access. Just print
the error to the console and proceed.
Also add an unrelated remark about the bcptr=9999999999 parameter in watch.py
403 errors on the video urls happen typically when a video has copyrighted content or was livestreamed originally. They appear to not happen (or at least happen less frequently) if the Tor exit node used ipv6, however.
These occur when too many requests are coming from a Tor exit node.
Before, there would be an error page with an exception instructing users to report the issue.
But this is an expected and persistent issue.