Two part question. I am trying to download multiple archived Cory Doctorow podcasts from the internet archive. The old one's that do not come into my iTunes feed. I have written the script but the downloaded files are not properly formatted.
Q1 - What do I change to download the zip mp3 files? Q2 - What is a better way to pass the variables into URL?
# and the base url. def dlfile(file_name,file_mode,base_url): from urllib2 import Request, urlopen, URLError, HTTPError #create the url and the request url = base_url + file_name + mid_url + file_name + end_url req = Request(url) # Open the url try: f = urlopen(req) print "downloading " + url # Open our local file for writing local_file = open(file_name, "wb" + file_mode) #Write to our local file local_file.write(f.read()) local_file.close() #handle errors except HTTPError, e: print "HTTP Error:",e.code , url except URLError, e: print "URL Error:",e.reason , url # Set the range var_range = range(150,153) # Iterate over image ranges for index in var_range: base_url = 'http://www.archive.org/download/Cory_Doctorow_Podcast_' mid_url = '/Cory_Doctorow_Podcast_' end_url = '_64kb_mp3.zip' #create file name based on known pattern file_name = str(index) dlfile(file_name,"wb",base_url
This script was adapted from here
Python wget download zip file One way to download a zip file from a URL in Python is to use the wget() function. But you need to install the wget library first using the pip command-line utility. Now you can use the wget library to download a zip file.
Simple urllib2 scripturlopen('http://python.org/') print "Response:", response # Get the URL. This gets the real URL. print "The URL is: ", response. geturl() # Getting the code print "This gets the code: ", response.
Here's how I'd deal with the url building and downloading. I'm making sure to name the file as the basename of the url (the last bit after the trailing slash) and I'm also using the with
clause for opening the file to write to. This uses a ContextManager which is nice because it will close that file when the block exits. In addition, I use a template to build the string for the url. urlopen
doesn't need a request object, just a string.
import os from urllib2 import urlopen, URLError, HTTPError def dlfile(url): # Open the url try: f = urlopen(url) print "downloading " + url # Open our local file for writing with open(os.path.basename(url), "wb") as local_file: local_file.write(f.read()) #handle errors except HTTPError, e: print "HTTP Error:", e.code, url except URLError, e: print "URL Error:", e.reason, url def main(): # Iterate over image ranges for index in range(150, 151): url = ("http://www.archive.org/download/" "Cory_Doctorow_Podcast_%d/" "Cory_Doctorow_Podcast_%d_64kb_mp3.zip" % (index, index)) dlfile(url) if __name__ == '__main__': main()
An older solution on SO along the lines of what you want:
download a zip file to a local drive and extract all files to a destination folder using python 2.5
Python and urllib
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With