I want to extract information from a YouTube playlist, but querying the whole playlist at once is quite unreliable even with the ignoreerrors flag: it sometimes gets stuck, especially when the internet connection is a bit shaky.
Should I instead fetch the playlist one entry at a time, by setting the playliststart and playlistend values and processing it in a loop?
My current code looks like this:
import youtube_dl

simulate_ydl_opts = {
    'format': "251",
    'playlistend': 50,
    'ignoreerrors': True,
    'simulate': True
}
youtube_dl_object = youtube_dl.YoutubeDL(simulate_ydl_opts)
test_info = youtube_dl_object.extract_info("https://www.youtube.com/user/Rasenfunk")
I think you can use ratelimit (e.g. 'ratelimit': 50000) in case the problem depends on your download speed, coupled with playliststart, playlistend, retries, and continuedl.
As per the common module in their GitHub repository, ratelimit is:
Download speed limit, in bytes/sec.
If you already know the speed at which the download should be capped, set it directly; it throttles the transfer when your bandwidth can't keep up. If you are not sure of the right limit, you can use something like speedtest-cli to measure your download speed and use that measurement as the cap. I put together this sample code to try it out:
import speedtest
import youtube_dl

s = speedtest.Speedtest()
s.get_best_server()
s.download()
# The measured download speed, in bits per second
download_bps = s.results.dict()['download']

simulate_ydl_opts = {
    'format': "251",
    'playlistend': 50,
    'ignoreerrors': True,
    'verbose': True,  # might give you a clue about the download speed as it prints
    'ratelimit': download_bps / 8,  # ratelimit expects bytes/sec; speedtest reports bits/sec
    'retries': 10,  # retry up to 10 times on failure
    'continuedl': True  # try to resume partial downloads if possible
}
youtube_dl_object = youtube_dl.YoutubeDL(simulate_ydl_opts)
test_info = youtube_dl_object.extract_info("https://www.youtube.com/user/Rasenfunk")
Yes, you should probably try downloading one video at a time. I also recommend having your program keep track of the last URL it processed so it can resume from where it left off. A threading-based timeout-and-restart scheme (spawn a new thread per video, with a timeout) would help the process run more smoothly.