The new 429 captcha page doesn't include the IP. This new page
appears to correspond to the 429 code plus the JSON {"redirect": ...}
that was occasionally received in the past when the pbj JSON
endpoint was used.
Closes #22
Signed-off-by: Jesús <heckyel@hyperbola.info>
Info parsing is handled by yt_data_extract, and HTML
post-processing is done with util.prefix_urls and
util.add_extra_html_info.
Signed-off-by: Jesús <heckyel@hyperbola.info>
This function was only necessary with the old ajax format, which
was removed in 4d7bba92eb62518e2273d030235214f4a7605444
Signed-off-by: Jesús <heckyel@hyperbola.info>
The request can be retried immediately after the first new
identity, but after any further new identity we have to wait at
least 6 seconds before making the request. Otherwise, based on my
experiments, it won't be made from a new IP.
Potential issue: if, after getting the third new identity, the
request takes > 12 seconds (the timeout is 15) and returns 429, then
the Tor Manager will allow a 4th try instead of giving up (meaning
the request takes forever from the user's perspective).
This should be a very rare occurrence, however.
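
A minimal sketch of the retry pattern described above, not the actual
Tor Manager code. The names fetch_with_new_identities, do_request and
new_identity are hypothetical; both callables are passed in so that
the waiting logic can be exercised without Tor. The real code is
time-based, which is where the 4th-try case comes from; this sketch
only counts identities.

```python
import time


def fetch_with_new_identities(do_request, new_identity,
                              max_identities=3, wait_seconds=6,
                              sleep=time.sleep):
    """Retry a request on 429, requesting a new identity each time.

    do_request() returns an HTTP status code (hypothetical helper).
    new_identity() asks Tor for a new identity (hypothetical helper).
    The request after the first new identity is sent immediately;
    after every later new identity we wait wait_seconds first.
    """
    status = do_request()
    identities_used = 0
    while status == 429 and identities_used < max_identities:
        new_identity()
        identities_used += 1
        if identities_used > 1:
            # Without this wait the request may leave from the same IP.
            sleep(wait_seconds)
        status = do_request()
    return status
```

If every request returns 429, the sequence is: request, new identity,
request, new identity, wait 6, request, new identity, wait 6, request,
then give up.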
Signed-off-by: Jesús <heckyel@hyperbola.info>
e.g. if the error in get_video_info is "Video unavailable", the
request must include the Accept-Language header (which we have in
watch_headers) in order to get an English error message. Otherwise
the message comes back in the language of the Tor exit node's region.
Example: https://youtu.be/aaaaaaaaaaa
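
A small sketch of the idea, assuming headers are held in a dict. The
helper name with_accept_language and the default header value are
illustrative only; the actual contents of watch_headers are not shown
here.

```python
def with_accept_language(headers, language='en-US,en;q=0.5'):
    """Return a copy of headers that always carries Accept-Language.

    An existing Accept-Language entry (in any letter case) is kept.
    The default value is an assumption chosen to prefer English.
    """
    merged = dict(headers)
    if not any(name.lower() == 'accept-language' for name in merged):
        merged['Accept-Language'] = language
    return merged
```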
Signed-off-by: Jesús <heckyel@hyperbola.info>
The issue that code was working around happened with an older
request format (the ajax format) that was removed. The issue
does not happen with the newer polymer format.
Signed-off-by: Jesús <heckyel@hyperbola.info>
e.g. this happens on a video where comments are disabled if comments
are disabled in settings, since the comments info object is just {}
Signed-off-by: Jesús <heckyel@hyperbola.info>
googlevideo sometimes doesn't send all of the video content before
closing the connection. Retry with a range request for the missing
bytes, up to a maximum of three times.
Fixes first type of #40
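
A rough sketch of the retry described above, not the code in this
commit. download_with_range_retries and the fetch callable are
hypothetical; fetch(start) stands for a request that carries the
header built by range_header(start) and may return fewer bytes than
asked for.

```python
def range_header(start):
    """Build an HTTP Range header value for "everything from start"."""
    return 'bytes=%d-' % start


def download_with_range_retries(fetch, total_length, max_retries=3):
    """Collect total_length bytes, re-requesting only what is missing.

    fetch(start) returns the bytes received for a request beginning at
    offset start; the connection may close early, so the result can be
    short. Returns (data, retries_used).
    """
    data = bytearray(fetch(0))
    retries = 0
    while len(data) < total_length and retries < max_retries:
        retries += 1
        # Ask only for the bytes we do not have yet.
        data += fetch(len(data))
    return bytes(data), retries
```

After three retries the function returns whatever it has, so the
caller still needs to compare the length against total_length.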
Signed-off-by: Jesús <heckyel@hyperbola.info>
The last page serves as a substitute for sorting by oldest, since
sorting by oldest doesn't allow arbitrary page numbers.
Signed-off-by: Jesús <heckyel@hyperbola.info>