
Boto script to download latest file from s3 bucket

I would like to write a Python script using boto to download the most recent file from an S3 bucket. For example, if I have 100 files in an S3 bucket, I need to download the most recently uploaded one.

Is there a way to download the most recently modified file from S3 using Python boto?

user1386776 asked Nov 05 '12 06:11


2 Answers

You could list all of the files in the bucket and pick the one with the most recent last_modified attribute.

>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.lookup('mybucketname')  # returns None if the bucket does not exist
>>> # Keys returned by a bucket listing carry last_modified as an ISO 8601 string,
>>> # so comparing the strings orders them chronologically.
>>> key_to_download = max(bucket, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('myfile')

Note, however, that this would be quite inefficient if you had lots of files in the bucket, because every key has to be listed before the newest one can be picked. In that case, you might want to consider using a database to keep track of the files and dates to make querying more efficient.
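As a rough illustration of the database idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name uploads, its columns, and the helper functions open_index, record_upload and latest_key are all made up for this example; it also assumes that your upload code calls record_upload every time it writes a file to the bucket, and that timestamps are stored as ISO 8601 UTC strings so that text ordering matches time ordering.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (and create if needed) a small table of uploaded keys and timestamps."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        " key_name TEXT PRIMARY KEY,"
        " last_modified TEXT NOT NULL)"  # ISO 8601 UTC strings (assumption)
    )
    # The index lets the "newest first" query avoid scanning the whole table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_uploads_modified"
        " ON uploads (last_modified)"
    )
    return conn


def record_upload(conn, key_name, last_modified):
    """Record an upload; uploading the same key again replaces its timestamp."""
    conn.execute(
        "INSERT OR REPLACE INTO uploads (key_name, last_modified) VALUES (?, ?)",
        (key_name, last_modified),
    )
    conn.commit()


def latest_key(conn):
    """Return the most recently modified key name, or None if nothing is recorded."""
    row = conn.execute(
        "SELECT key_name FROM uploads ORDER BY last_modified DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

With something like this in place, the download step only needs one query: take the name returned by latest_key(conn), fetch that single key with bucket.get_key(name), and call get_contents_to_filename on it, without listing the bucket at all.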

garnaat answered Sep 28 '22 14:09


To add to @garnaat's answer, you may be able to address the inefficiency by using a prefix to reduce the number of matched files. Instead of iterating over the whole bucket, this example lists only the keys under subdir/ whose names start with file_2014_:

>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.get_bucket('mybucketname')
>>> # Only keys whose names start with this prefix are listed
>>> bucket_files = bucket.list(prefix='subdir/file_2014_')
>>> key_to_download = max(bucket_files, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('target_filename')
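For anyone on boto3 rather than the original boto library, the same prefix approach might look roughly like the sketch below. This swaps in a different library from the one used above, and the function names newest_object and download_latest are hypothetical. It assumes credentials are already configured and relies on the list_objects_v2 paginator, whose pages carry a "Contents" list of entries with "Key" and "LastModified" fields.

```python
def newest_object(pages):
    """Return the listing entry with the greatest LastModified, or None if empty."""
    newest = None
    for page in pages:
        # A page with no matching keys has no "Contents" entry at all.
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return newest


def download_latest(bucket_name, prefix, target_filename):
    """Download the newest object under prefix; return its key, or None if no match."""
    import boto3  # imported lazily so newest_object works without boto3 installed

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket_name, Prefix=prefix)
    newest = newest_object(pages)
    if newest is None:
        return None
    s3.download_file(bucket_name, newest["Key"], target_filename)
    return newest["Key"]
```

A call such as download_latest('mybucketname', 'subdir/file_2014_', 'target_filename') would then mirror the boto example. The listing still visits every key under the prefix, so the prefix is what keeps the cost down.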

Scott Stafford answered Sep 28 '22 15:09