I'm trying to download a file from S3 using boto, but only if a local copy of the file is older than the remote file.
I'm using the header 'If-Modified-Since' and the code below:
#!/usr/bin/python
import os
import datetime
import boto
from boto.s3.key import Key

bucket_name = 'my-bucket'
conn = boto.connect_s3()
bucket = conn.get_bucket(bucket_name)

def download(bucket, filename):
    key = Key(bucket, filename)
    headers = {}
    if os.path.isfile(filename):
        print "File exists, adding If-Modified-Since header"
        modified_since = os.path.getmtime(filename)
        timestamp = datetime.datetime.utcfromtimestamp(modified_since)
        headers['If-Modified-Since'] = timestamp.strftime("%a, %d %b %Y %H:%M:%S GMT")
    try:
        key.get_contents_to_filename(filename, headers)
    except boto.exception.S3ResponseError as e:
        return 304
    return 200

print download(bucket, 'README')
The problem is that when the local file does not exist everything works well and the file is downloaded. When I run the script for the second time my function returns 304 as expected, but the file that was previously downloaded is deleted.
boto.s3.key.Key.get_contents_to_filename opens the file in wb mode, which truncates it at the beginning of the function (see boto/s3/key.py). In addition, it removes the file when an exception is raised.
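To make the failure mode concrete, here is a minimal sketch that reproduces the behaviour described above without touching S3. The names NotModified, fetch_like_get_contents_to_filename and always_not_modified are hypothetical stand-ins written for this illustration; they are not part of boto, and the function only mimics the open-with-wb-then-remove-on-error pattern rather than copying boto's source.

```python
import os
import shutil
import tempfile


class NotModified(Exception):
    """Hypothetical stand-in for the error raised on an HTTP 304 response."""


def fetch_like_get_contents_to_filename(filename, fetch):
    """Mimic the described pattern: open with 'wb', remove the file on error."""
    try:
        with open(filename, 'wb') as fp:  # existing content is discarded here
            fetch(fp)
    except Exception:
        os.remove(filename)  # the local copy is deleted before re-raising
        raise


def always_not_modified(fp):
    # Simulates the server answering 304 before any data is written.
    raise NotModified()


def demo():
    """Return True if the local file survives a simulated 304, else False."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, 'README')
    try:
        with open(path, 'wb') as fp:
            fp.write(b'previously downloaded content')
        try:
            fetch_like_get_contents_to_filename(path, always_not_modified)
        except NotModified:
            pass
        return os.path.exists(path)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

Calling demo() shows that the previously downloaded file no longer exists after the simulated 304, which matches the symptom in the question.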
Instead of get_contents_to_filename, you can use get_contents_to_file with a different open mode:
def download(bucket, filename):
    key = Key(bucket, filename)
    headers = {}
    mode = 'wb'
    updating = False
    if os.path.isfile(filename):
        mode = 'r+b'
        updating = True
        print "File exists, adding If-Modified-Since header"
        modified_since = os.path.getmtime(filename)
        timestamp = datetime.datetime.utcfromtimestamp(modified_since)
        headers['If-Modified-Since'] = timestamp.strftime("%a, %d %b %Y %H:%M:%S GMT")
    try:
        with open(filename, mode) as f:
            key.get_contents_to_file(f, headers)
            f.truncate()
    except boto.exception.S3ResponseError as e:
        if not updating:
            # got an error and we are not updating an existing file
            # delete the file that was created due to mode = 'wb'
            os.remove(filename)
        return e.status
    return 200
NOTE: file.truncate is used to handle the case where the new file is smaller than the previous one.
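The effect of the truncate call can be checked locally with a small sketch. It assumes nothing about boto; overwrite_in_place and demo are hypothetical helper names used only for this illustration, and the byte strings are made-up sample data.

```python
import os
import tempfile


def overwrite_in_place(path, new_bytes, truncate=True):
    """Overwrite a file from offset 0 in 'r+b' mode and return its final content."""
    with open(path, 'r+b') as f:  # opens without discarding existing content
        f.write(new_bytes)        # overwrites the leading bytes only
        if truncate:
            f.truncate()          # cut the file at the current position
    with open(path, 'rb') as f:
        return f.read()


def demo():
    """Return the file content without and with truncate after a shorter write."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as f:
            f.write(b'0123456789')
        without = overwrite_in_place(path, b'abc', truncate=False)
        with open(path, 'wb') as f:
            f.write(b'0123456789')
        with_truncate = overwrite_in_place(path, b'abc', truncate=True)
        return without, with_truncate
    finally:
        os.remove(path)
```

Without truncate, the tail of the old content is left behind after the shorter write; with it, the file ends where the new data ends.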