I would like to write a Python script using boto to download the most recent file from an S3 bucket. For example, if I have 100 files in a bucket, I need to download the most recently uploaded one.
Is there a way to download the most recently modified file from S3 using Python boto?
You could list all of the files in the bucket and find the most recent one (using the last_modified attribute).
>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.lookup('mybucketname')
>>> # Iterating over the bucket lists every key; keep the one with the latest last_modified
>>> key_to_download = max(bucket, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('myfile')
Note, however, that this would be quite inefficient if you had lots of files in the bucket, because every key has to be listed on each lookup. In that case, you might want to consider using a database to keep track of the files and dates to make querying more efficient.
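As a rough illustration of the database idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name s3_files, its columns, and the helper functions are all hypothetical and not part of boto; the sketch also assumes you record each upload yourself and store timestamps as ISO 8601 strings, which sort chronologically when compared as text.

```python
import sqlite3


def create_index(conn):
    # Hypothetical schema: one row per S3 key, plus an index on the timestamp.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS s3_files ("
        " key_name TEXT PRIMARY KEY,"
        " last_modified TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_s3_files_modified"
        " ON s3_files (last_modified)"
    )


def record_upload(conn, key_name, last_modified):
    # Call this whenever a file is uploaded; re-uploads overwrite the old row.
    conn.execute(
        "INSERT OR REPLACE INTO s3_files (key_name, last_modified)"
        " VALUES (?, ?)",
        (key_name, last_modified),
    )
    conn.commit()


def newest_key_name(conn):
    # Return the name of the most recently modified key, or None if empty.
    row = conn.execute(
        "SELECT key_name FROM s3_files"
        " ORDER BY last_modified DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

With something like this in place, the script would only need to ask the database for the newest key name and then fetch that single key from the bucket, instead of listing everything.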
To add to @garnaat's answer, you may be able to address the inefficiency by using a prefix to reduce the number of matched files. Instead of iterating over the whole bucket returned by c.lookup, this example only searches files in the subdir sub-bucket that start with file_2014_:
>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.get_bucket('mybucketname')
>>> bucket_files = bucket.list('subdir/file_2014_')
>>> # Keep the key with the latest last_modified among the keys matching the prefix
>>> key_to_download = max(bucket_files, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('target_filename')
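If you want to reuse this, the prefix listing and the selection step could be wrapped in one helper. The sketch below is only an illustration: newest_key is a hypothetical name, and FakeKey and FakeBucket are stand-ins so the example runs without AWS credentials. It assumes the bucket object offers list(prefix=...) as boto's Bucket does, and that last_modified values from a listing are ISO 8601 strings that compare correctly as text.

```python
from collections import namedtuple


def newest_key(bucket, prefix=""):
    """Return the most recently modified key under prefix, or None."""
    newest = None
    for key in bucket.list(prefix=prefix):
        # Single pass: remember the key with the largest timestamp so far.
        if newest is None or key.last_modified > newest.last_modified:
            newest = key
    return newest


# Stand-ins for boto objects, used only to demonstrate the helper offline.
FakeKey = namedtuple("FakeKey", ["name", "last_modified"])


class FakeBucket(object):
    def __init__(self, keys):
        self._keys = list(keys)

    def list(self, prefix=""):
        # Mimics prefix filtering: only keys whose name starts with prefix.
        return [k for k in self._keys if k.name.startswith(prefix)]
```

With a real bucket you would call newest_key(bucket, 'subdir/file_2014_') and, if the result is not None, call get_contents_to_filename on it. Returning None also covers the case where no key matches the prefix, which would otherwise raise an error when taking the last element of an empty list.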