Bug:
Traceback (most recent call last):
File "/home/rusian/yt-local/youtube/comments.py", line 180, in video_comments
post_process_comments_info(comments_info)
File "/home/rusian/yt-local/youtube/comments.py", line 81, in post_process_comments_info
comment['author'] = strip_non_ascii(comment['author'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rusian/yt-local/youtube/util.py", line 843, in strip_non_ascii
stripped = (c for c in string if 0 < ord(c) < 127)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "src/gevent/greenlet.py", line 900, in gevent._gevent_cgreenlet.Greenlet.run
File "/home/rusian/yt-local/youtube/comments.py", line 195, in video_comments
comments_info['error'] = 'YouTube blocked the request. IP address: %s' % e.ip
^^^^
AttributeError: 'TypeError' object has no attribute 'ip'
2025-03-08T01:25:47Z <Greenlet at 0x7f251e5279c0: video_comments('hcm55lU9knw', 0, lc='')> failed with AttributeError
tests/test_util.py: 14 warnings
/home/runner/work/yt-local/youtube-local/youtube/util.py:321: DeprecationWarning: HTTPResponse.getheader() is deprecated and will be removed in urllib3 v2.1.0. Instead use HTTPResponse.headers.get(name, default).
response.getheader('Content-Encoding', default='identity'))
Sometimes YouTube redirects to a google.com/sorry page, seemingly
setting up redirect loops. Other times the url redirects
to itself.
Signed-off-by: Jesús <heckyel@hyperbola.info>
watch_comment api periodically gives the error "Top level
comments mweb servlet is turned down."
The continuation items for the new api are in a different
arrangement in the json, so changes were necessary to the
extract_items function.
Signed-off-by: Jesús <heckyel@hyperbola.info>
New 429 captcha page doesn't have IP. This new page appears to
match the 429 code plus the json of {"redirect": ...} which would
be occasionally received when the pbj json endpoint was used in
the past.
Closes#22
Signed-off-by: Jesús <heckyel@hyperbola.info>
Info parsing is handled by yt_data_extract, and html
post-processing is done with util.prefix_urls and
util.add_extra_html_info
Signed-off-by: Jesús <heckyel@hyperbola.info>
This function was only necessary with the old ajax format, which
was removed in 4d7bba92eb62518e2273d030235214f4a7605444
Signed-off-by: Jesús <heckyel@hyperbola.info>
The request can be retried immediately after the first
new identity, but if we do more new identities, we have to wait
for at least 6 seconds before doing the request, otherwise
it won't be done on a new ip based on my experiments.
Potential issue: If after getting third new identity, request
takes > 12 seconds (since timeout is 15) and returns 429, then the
Tor Manager will let it do a 4th try instead of giving up (meaning
request is taking forever from user's perspective).
Should be a very rare occurence however.
Signed-off-by: Jesús <heckyel@hyperbola.info>