YouTube reverted the changes they made that prompted f9f5d5ba.
In case they change their minds again, this adds support for both
formats.
The liberal_update and conservative_update functions needed to be
modified to handle the cases of empty lists, so that
a successfully extracted 'music_list': [{'Author':...},...] will
not be overwritten by 'music_list': [] in the calls to
liberal_dict_update.
Signed-off-by: Jesús <heckyel@hyperbola.info>
watch_comment api periodically gives the error "Top level
comments mweb servlet is turned down."
The continuation items for the new api are in a different
arrangement in the json, so changes were necessary to the
extract_items function.
Signed-off-by: Jesús <heckyel@hyperbola.info>
Also moves some microformat extraction from
_extract_watch_info_mobile to extract_watch_info where it belongs.
_extract_watch_info_mobile is really only for stuff visible on the
page, and thus specialized for either mobile or desktop.
Signed-off-by: Jesús <heckyel@hyperbola.info>
Comment reply protobuf now requires the channel id of the uploader
of the video. Otherwise the endpoint returns 500.
Instead of making the protobuf ourselves and passing this data
around through query parameters, just use the ctoken provided to us
but modify the max_replies field from 10 to 250.
Fixes#53
Signed-off-by: Jesús <heckyel@hyperbola.info>
YouTube now includes e.g. {"fe": ...} instead of just {fe: ...}
in the javascript object entries in the object holding the
operation definitions.
Fixes#2
Signed-off-by: Jesús <heckyel@hyperbola.info>
for instance, urls that start with // become https://
adjustment required in comments.py because the url was left as a
relative url in yt_data_extract by mistake and was using URL_ORIGIN
prefix as fix.
see #31
[something]Continuation renderers, all of which are junk
except one. Check the items in each one until the one which
contains the items being sought is found.
The usage in extract_comments_info needed to be changed to
specify the items being sought. It was unspecified before which
is strictly incorrect since extract_items by default looks for
video/playlist/channel thumbnail items. It was relying on this
special case for continuations. But now that wouldn't work
anymore.
Use extract_str function since it's not always 'simpleText'
Make sure we don't output an empty error message if we don't
know what it is.
channel.py: Don't check if error message is empty, check if it's
None
Since there are no formats, it was retrying with the
non-embedded playerResponse, which resulted in the
hls_manifest_urls from the embedded player_response being
overwritten with None. So use conservative_update instead
Change so it extracts other stuff from regular playerResponse
Extract formats from embedded player response, but fallback to
regular one if that doesn't work.
Sometimes there is no 'player' at top_level and the urls are in
the regular playerResponse
The base.js url format changed, so the identifier at the end
was no longer unique. So it was using the wrong cached decryption
function
Changes the identifier to just be the whole url so
this won't happen again.
'ip_address' was not set when no formats are available
'allowed_countries' was set to None rather than [] in extract_desktop_info which it turns out is the function that gets used in these cases