Download file using partial download (HTTP)

Is there a way to download a huge and still-growing file over HTTP using the partial-download feature?

It seems that this code downloads the file from scratch every time it is executed:

import urllib
urllib.urlretrieve("http://www.example.com/huge-growing-file", "huge-growing-file")

I'd like:

  1. To fetch just the newly-written data
  2. To download from scratch only if the source file becomes smaller (for example, if it has been rotated).
asked Nov 25 '09 by Konstantin


1 Answer

It is possible to do a partial download using the Range header; the following requests a selected range of bytes:

import urllib2

req = urllib2.Request('http://www.python.org/')
req.headers['Range'] = 'bytes=%s-%s' % (start, end)
f = urllib2.urlopen(req)

For example:

>>> req = urllib2.Request('http://www.python.org/')
>>> req.headers['Range'] = 'bytes=%s-%s' % (100, 150)
>>> f = urllib2.urlopen(req)
>>> f.read()
'l1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.'

Using this header you can resume partial downloads. In your case, all you have to do is keep track of the size already downloaded and request a new range starting from there.

Keep in mind that the server needs to support this header for it to work; a server that honors it replies with a 206 Partial Content status rather than 200.
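Putting both of your requirements together, here is a minimal sketch in Python 3 (where urllib.request replaces urllib2). The helper names next_range and fetch_new_data are made up for illustration, and the server is assumed to honor Range requests:

```python
import os
import urllib.request


def next_range(local_size, remote_size):
    """Decide the byte offset to resume from.

    Returns 0 (start over) if the remote file became smaller than our
    local copy, e.g. because it was rotated; otherwise returns the size
    we already have, so we only fetch the newly written tail.
    """
    return 0 if remote_size < local_size else local_size


def fetch_new_data(url, dest):
    """Append only the newly written bytes of a growing remote file."""
    local = os.path.getsize(dest) if os.path.exists(dest) else 0

    # A HEAD request tells us the current remote size without a download.
    head = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(head) as resp:
        remote = int(resp.headers['Content-Length'])

    start = next_range(local, remote)
    if start == remote:
        return  # nothing new to fetch

    req = urllib.request.Request(url, headers={'Range': 'bytes=%d-' % start})
    with urllib.request.urlopen(req) as resp:
        # Truncate on a restart, append when resuming.
        with open(dest, 'wb' if start == 0 else 'ab') as out:
            out.write(resp.read())
```

Run fetch_new_data periodically (e.g. from cron) and it only transfers the bytes written since the last run, restarting cleanly if the source file shrank.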

answered Oct 13 '22 by Nadia Alramli