
Python: Unpredictable memory error when downloading large files

I wrote a Python script that downloads a large number of video files (50-400 MB each) from an HTTP server. It has worked well so far on long lists of downloads, but occasionally, for no reason I can identify, it fails with a MemoryError.

The machine has about 1 GB of RAM free, but I don't think it's ever maxed out on RAM while running this script.

I've monitored memory usage in Task Manager and perfmon, and from what I've seen it always behaves the same way: it slowly increases during a download, then returns to its normal level once the download finishes (there are no small leaks that creep up or anything like that).

The download creates the local file, which remains at 0 KB until the download finishes (or the program crashes); the script then writes the whole file at once and closes it.

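import os
import time
import urllib2

# urls, filenames, headers, domain and folderName are defined earlier in the script
for i in range(len(urls)):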
    if os.path.exists(folderName + '/' + filenames[i] + '.mov'):
        print 'File exists, continuing.'
        continue

    # Request the download page
    req = urllib2.Request(urls[i], headers = headers)

    sock = urllib2.urlopen(req)
    responseHeaders = sock.headers
    body = sock.read()
    sock.close()

    # Search the page for the download URL
    tmp = body.find('/getfile/')
    downloadSuffix = body[tmp:body.find('"', tmp)]
    downloadUrl = domain + downloadSuffix

    req = urllib2.Request(downloadUrl, headers = headers)

    print ('%s Downloading %s, file %i of %i'
        % (time.ctime(), filenames[i], i+1, len(urls)))

    f = urllib2.urlopen(req)

    # Open our local file for writing, 'b' for binary file mode
    video_file = open(folderName + '/' + filenames[i] + '.mov', 'wb')

    # Write the downloaded data to the local file
    video_file.write(f.read()) ##### MemoryError: out of memory #####
    video_file.close()

    print '%s Download complete!' % (time.ctime())

    # Free up memory, in hopes of preventing memory errors
    del f
    del video_file

Here is the stack trace:

  File "downloadVideos.py", line 159, in <module>
    main()
  File "downloadVideos.py", line 136, in main
    video_file.write(f.read())
  File "c:\python27\lib\socket.py", line 358, in read
    buf.write(data)
MemoryError: out of memory
Tim R. asked Mar 30 '11 21:03



1 Answer

Your problem is here: f.read(). Called without a size argument, it attempts to read the entire file into memory before anything is written to disk. Instead, read in chunks (chunk = f.read(4096)) and write each piece to a temporary file as it arrives.
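As a rough illustration of the chunked approach, here is a minimal sketch. The helper names copy_in_chunks and save_stream, the 64 KB chunk size, and the '.part' suffix are assumptions made for this example and do not come from the original script. The helpers only rely on the source object having a read(size) method, which the response returned by urllib2.urlopen provides; the demo at the end uses an in-memory buffer as a stand-in for that response.

```python
import io
import os

CHUNK_SIZE = 64 * 1024  # assumed value; the answer suggests 4096


def copy_in_chunks(source, destination, chunk_size=CHUNK_SIZE):
    """Copy between file-like objects one chunk at a time; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


def save_stream(source, path, chunk_size=CHUNK_SIZE):
    """Stream source into path via a temporary '.part' file (hypothetical naming)."""
    temp_path = path + '.part'
    with open(temp_path, 'wb') as temp_file:
        total = copy_in_chunks(source, temp_file, chunk_size)
    # Rename only after the whole stream has been written, so an interrupted
    # download never leaves behind a file that looks complete.
    os.rename(temp_path, path)
    return total


# Demo with an in-memory stand-in for the HTTP response.
demo_source = io.BytesIO(b'x' * 200000)
demo_target = io.BytesIO()
demo_copied = copy_in_chunks(demo_source, demo_target)
```

In the question's loop, the open/write/close lines could then become something like save_stream(f, folderName + '/' + filenames[i] + '.mov'). Memory use should stay near one chunk regardless of file size, and because the final name only appears after a successful download, the os.path.exists check at the top of the loop would not skip a half-written file. The standard library's shutil.copyfileobj(f, video_file) performs the same kind of chunked copy if you would rather not write the loop yourself.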

bradley.ayers answered Oct 06 '22 08:10
