
Python requests: download only if newer

What is the standard pythonic way to download a new file from a server only if the server copy is newer than the local one?

Either my python-search-fu is very weak today, or one really does need to roll their own date-time parser and comparer like below. Is there really no requests.header.get_datetime_object('last-modified') or request.save_to_file(url, outfile, maintain_datetime=True)?

import calendar
import datetime
import os
import time
import requests

HTTP_DATE = '%a, %d %b %Y %X %Z'  # Fri, 27 Mar 2015 08:05:42 GMT

def time_is_older(str_time, time_object):
    ''' Parse str_time and see if it is older than time_object.
        This is a fragile function: what if str_time is in a different locale?
    '''
    parsed_time = datetime.datetime.strptime(str_time, HTTP_DATE)
    # the header is GMT but time_object is naive local time, so convert before comparing
    parsed_local = parsed_time.replace(tzinfo=datetime.timezone.utc).astimezone().replace(tzinfo=None)
    return parsed_local < time_object

r = requests.head(url)
url_time = r.headers['last-modified']
file_time = datetime.datetime.fromtimestamp(os.path.getmtime(dstFile))
print(url_time)   # emits 'Sat, 28 Mar 2015 08:05:42 GMT' on my machine
print(file_time)  # emits '2015-03-27 21:53:28.175072'

if time_is_older(url_time, file_time):
    print('url modtime is not newer than local file, skipping download')
else:
    do_download(url)
    # maintain server's file timestamp; os.utime needs numeric (atime, mtime), not the header string
    mtime = calendar.timegm(time.strptime(url_time, HTTP_DATE))
    os.utime(dstFile, (mtime, mtime))
matt wilkie asked Mar 28 '15 06:03




1 Answer

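import os
import datetime
import requests
from dateutil.parser import parse as parsedate

r = requests.head(url)
url_time = r.headers['last-modified']
url_date = parsedate(url_time)  # timezone-aware (UTC) for a header ending in 'GMT'
# make the local mtime timezone-aware too; comparing naive and aware datetimes raises TypeError
file_time = datetime.datetime.fromtimestamp(os.path.getmtime(dstFile), tz=datetime.timezone.utc)
if url_date > file_time:
    do_download(url)  # download helper from the question (not shown)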
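
If adding dateutil as a dependency is not wanted, the same comparison can probably be done with the standard library alone, since email.utils.parsedate_to_datetime understands the HTTP date format. The following is only a sketch under that assumption; the helper names is_remote_newer and set_mtime_from_header are made up for illustration and are not part of requests.

```python
import os
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_remote_newer(last_modified, local_path):
    """Return True if the Last-Modified header value is newer than local_path.

    A missing local file counts as out of date, so the caller downloads it.
    """
    if not os.path.exists(local_path):
        return True
    # Timezone-aware (UTC) for header values such as 'Sat, 28 Mar 2015 08:05:42 GMT'
    remote_time = parsedate_to_datetime(last_modified)
    # Read the local mtime as an aware UTC datetime so the two are comparable
    local_time = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return remote_time > local_time


def set_mtime_from_header(local_path, last_modified):
    """Stamp local_path with the server's modification time."""
    mtime = parsedate_to_datetime(last_modified).timestamp()
    os.utime(local_path, (mtime, mtime))  # (atime, mtime) in seconds since the epoch
```

Typical use would be to pass r.headers['last-modified'] from the HEAD request to is_remote_newer, download when it returns True, and then call set_mtime_from_header so the next check compares against the server's timestamp rather than the download time.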
Sérgio answered Sep 21 '22 18:09
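
Another option that avoids comparing dates on the client is a conditional GET: send the local file's mtime in an If-Modified-Since header and let the server reply 304 Not Modified when nothing changed. The sketch below assumes the server honours that header (not every server does; some always return 200). The function names conditional_headers and download_if_newer are hypothetical, and session is assumed to be a requests.Session or anything with a compatible get().

```python
import os
from email.utils import formatdate, parsedate_to_datetime


def conditional_headers(local_path):
    """Build an If-Modified-Since header from local_path's mtime, if the file exists."""
    if not os.path.exists(local_path):
        return {}
    # usegmt=True yields the HTTP date form, e.g. 'Sat, 28 Mar 2015 08:05:42 GMT'
    return {"If-Modified-Since": formatdate(os.path.getmtime(local_path), usegmt=True)}


def download_if_newer(session, url, local_path):
    """GET url only if the server reports a change; return True if a file was written.

    session is assumed to be a requests.Session (or an object with a compatible get()).
    """
    response = session.get(url, headers=conditional_headers(local_path), timeout=30)
    if response.status_code == 304:  # Not Modified: keep the local copy
        return False
    response.raise_for_status()
    with open(local_path, "wb") as f:
        f.write(response.content)
    last_modified = response.headers.get("Last-Modified")
    if last_modified:
        # Keep the server's timestamp so the next conditional request is accurate
        mtime = parsedate_to_datetime(last_modified).timestamp()
        os.utime(local_path, (mtime, mtime))
    return True
```

With requests installed, this would be called as download_if_newer(requests.Session(), url, dstFile). Compared with a HEAD request followed by a GET, it needs only one round trip when the file is unchanged.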