Can’t download youtube video

Question

I’m having trouble retrieving the YouTube video automatically. Here’s the code. The problem is the last part. download = urllib.request.urlopen(download_url).read()

# YouTube video download script
# 10n1z3d[at]w[dot]cn

import urllib.request
import sys

print("
--------------------------")
print (" YouTube Video Downloader")
print ("--------------------------
")

try:
    video_url = sys.argv[1]
except:
    video_url = input('[+] Enter video URL: ')

print("[+] Connecting...")
try:
    if(video_url.endswith('&feature=related')):
        video_id = video_url.split('www.youtube.com/watch?v=')[1].split('&feature=related')[0]
    elif(video_url.endswith('&feature=dir')):
        video_id = video_url.split('www.youtube.com/watch?v=')[1].split('&feature=dir')[0]
    elif(video_url.endswith('&feature=fvst')):
        video_id = video_url.split('www.youtube.com/watch?v=')[1].split('&feature=fvst')[0]
    elif(video_url.endswith('&feature=channel_page')):
        video_id = video_url.split('www.youtube.com/watch?v=')[1].split('&feature=channel_page')[0]
    else:
        video_id = video_url.split('www.youtube.com/watch?v=')[1]
except:
    print("[-] Invalid URL.")
    exit(1)

print("[+] Parsing token...")
try:
    url = str(urllib.request.urlopen('http://www.youtube.com/get_video_info?&video_id=' + video_id).read())
    token_value = url.split('video_id=' + video_id + '&token=')[1].split('&thumbnail_url')[0]

    download_url = "http://www.youtube.com/get_video?video_id=" + video_id + "&t=" + token_value + "&fmt=18"
except:
    url = str(urllib.request.urlopen('www.youtube.com/watch?v=' + video_id))
    exit(1)

v_url = str(urllib.request.urlopen('http://' + video_url).read())
video_title = v_url.split('"rv.2.title": "')[1].split('", "rv.4.rating"')[0]
if '&quot;' in video_title:
    video_title = video_title.replace('&quot;', '"')
elif '&amp;' in video_title:
    video_title = video_title.replace('&amp;', '&')

print("[+] Downloading " + '"' + video_title + '"...')
try:
    print(download_url)
    file = open(video_title + '.mp4', 'wb')
    download = urllib.request.urlopen(download_url).read()
    print(download)
    for line in download:
        file.write(line)
        file.close()
except:
    print("[-] Error downloading. Quitting.")
    exit(1)

print("
[+] Done. The video is saved to the current working directory(cwd).
")

There’s an error message (thanks Wooble):

Traceback (most recent call last):
  File "C:/Python31/MyLib/DrawingBoard/youtube_download-.py", line 52, in <module>
    download = urllib.request.urlopen(download_url).read()
  File "C:\Python31\lib\urllib
equest.py", line 119, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python31\lib\urllib
equest.py", line 353, in open
    response = meth(req, response)
  File "C:\Python31\lib\urllib
equest.py", line 465, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python31\lib\urllib
equest.py", line 385, in error
    result = self._call_chain(*args)
  File "C:\Python31\lib\urllib
equest.py", line 325, in _call_chain
    result = func(*args)
  File "C:\Python31\lib\urllib
equest.py", line 560, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python31\lib\urllib
equest.py", line 353, in open
    response = meth(req, response)
  File "C:\Python31\lib\urllib
equest.py", line 465, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python31\lib\urllib
equest.py", line 391, in error
    return self._call_chain(*args)
  File "C:\Python31\lib\urllib
equest.py", line 325, in _call_chain
    result = func(*args)
  File "C:\Python31\lib\urllib
equest.py", line 473, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

rbp · Accepted Answer

The code on the original question relies on several assumptions about the content of YouTube pages and URLs (expressed in constructs such as "url.split('something=')[1]") which may not always be true. I tested it and it might depend even on which related videos show on the page. You might have tripped on any of those specificities.

Here's a cleaner version, which uses urllib to parse URLs and query strings, and which successfully downloads a video. I've removed some of the try/except which didn't do much but exit, for clarity. Incidentally, it deals with Unicode video titles by removing non-ASCII characters from the filename to which the video is saved. It also takes any numbers of YouTube URLs and downloads them all. Finally, it masks its user-agent as Chrome for Mac (which is what I currently use).

#!/usr/bin/env python3

import sys
import urllib.request
from urllib.request import urlopen, FancyURLopener
from urllib.parse import urlparse, parse_qs, unquote

class UndercoverURLopener(FancyURLopener):
    version = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.9 Safari/533.2"

urllib.request._urlopener = UndercoverURLopener()

def youtube_download(video_url):
    video_id = parse_qs(urlparse(video_url).query)['v'][0]

    url_data = urlopen('http://www.youtube.com/get_video_info?&video_id=' + video_id).read()
    url_info = parse_qs(unquote(url_data.decode('utf-8')))
    token_value = url_info['token'][0]

    download_url = "http://www.youtube.com/get_video?video_id={0}&t={1}&fmt=18".format(
        video_id, token_value)

    video_title = url_info['title'][0] if 'title' in url_info else ''
    # Unicode filenames are more trouble than they're worth
    filename = video_title.encode('ascii', 'ignore').decode('ascii').replace("/", "-") + '.mp4'

    print("	 Downloading '{}' to '{}'...".format(video_title, filename))

    try:
        download = urlopen(download_url).read()
        f = open(filename, 'wb')
        f.write(download)
        f.close()
    except Exception as e:
        print("	 Download failed! {}".format(str(e)))
        print("	 Skipping...")
    else:
        print("	 Done.")

def main():
    print("
--------------------------")
    print (" YouTube Video Downloader")
    print ("--------------------------
")

    try:
        video_urls = sys.argv[1:]
    except:
        video_urls = input('Enter (space-separated) video URLs: ')

    for u in video_urls:
        youtube_download(u)
    print("
 Done.")

if __name__ == '__main__':
    main()

3 revs, 2 users 79% · Answer

I'm going to shamelessly plug my script which automates checking for valid formats, automatically choosing the best quality format for a video, and works on both the Flash and HTML5 variants of YouTube pages (as well as Vimeo).

If you wrote that script then please look at my source code for inspiration and feel free to steal some code. I challenge you to please write something better. Open source thrives on competition!

However, if you copied that script and are just trying to get it working, may I suggest you give my script a try and see if it fares better for you. You can access it both from the command line as a script or even as a module in another Python file.

kenorb · Answer

You may also check youtube-dl which is written in Python and check how it's written.

Can’t download youtube video

Tags:

python

file

python-3.x

urllib

Kev

3 Answers

rbp

3 revs, 2 users 79%

kenorb

Recent Activity

Donate For Us

Can’t download youtube video

Tags:

python

file

python-3.x

urllib

Kev

3 Answers

rbp

3 revs, 2 users 79%

kenorb

Related questions

Recent Activity

Donate For Us