
S3 / boto: list files within a bucket by upload time

Every hour, I need to download the 100 newest files from an S3 bucket.

bucketList = bucket.list(PREFIX)

The code above creates a list of the files, but the order does not depend on the upload time of the files. Is that because it lists by file name?

I cannot change the file names; they are assigned randomly.

Thanks.

Ron D. asked Dec 16 '22 07:12



2 Answers

How big is the list? You could sort it on the 'last_modified' attribute of each Key, newest first:

# Sort newest first; an ascending sort would put the 100 oldest keys at the front.
orderedList = sorted(bucketList, key=lambda k: k.last_modified, reverse=True)
keysYouWant = orderedList[0:100]

If your list is HUGE this may not be efficient, because every key under the prefix has to be listed before it can be sorted. Check out the inline docs for the list() function in boto.s3.bucket.Bucket.
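If the full sort is the concern, one option is to keep only the top 100 while iterating. The sketch below uses heapq.nlargest from the standard library. It still has to walk every key under the prefix, but it does not hold or sort the whole list. FakeKey and newest_keys are hypothetical names used for illustration, and the sketch assumes last_modified is an ISO 8601 UTC string in a uniform format, so that string comparison matches time order.

```python
import heapq
from collections import namedtuple

# Hypothetical stand-in for boto's Key objects; only the attributes used here.
FakeKey = namedtuple("FakeKey", ["name", "last_modified"])

def newest_keys(keys, n=100):
    """Return the n most recently modified keys, newest first."""
    # Assumption: timestamps share one ISO 8601 UTC format, so comparing
    # them as strings gives the same order as comparing them as times.
    return heapq.nlargest(n, keys, key=lambda k: k.last_modified)

sample = [
    FakeKey("a9f3", "2013-01-15T10:22:31.000Z"),
    FakeKey("07bc", "2013-01-15T11:05:02.000Z"),
    FakeKey("e210", "2013-01-14T23:59:59.000Z"),
]
newest = newest_keys(sample, n=2)
```

With a real bucket, the call would presumably look like newest_keys(bucket.list(PREFIX)), since the function accepts any iterable of objects that have a last_modified attribute.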

kevinharvey answered Dec 28 '22 11:12



My reading of the List Objects operation documentation suggests that objects are always listed in alphabetical order (by object key).

If you encode the creation time of each object into the object key, you may be able to achieve what you want.
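As a rough sketch of that idea, and only applicable if whoever uploads the files can choose the key: prefix each key with an inverted timestamp, so that a listing ordered by key returns the newest objects first. MAX_EPOCH and time_ordered_key are hypothetical names, and the key format is an assumption made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical upper bound; subtracting from it keeps the prefix at 10 digits.
MAX_EPOCH = 9999999999

def time_ordered_key(original_name, uploaded_at):
    """Build a key whose lexicographic order is newest first."""
    epoch = int(uploaded_at.timestamp())
    # A later upload gives a smaller prefix, so it sorts earlier.
    return "%010d-%s" % (MAX_EPOCH - epoch, original_name)

older = time_ordered_key("a9f3", datetime(2013, 1, 15, 10, 0, tzinfo=timezone.utc))
newer = time_ordered_key("07bc", datetime(2013, 1, 15, 11, 0, tzinfo=timezone.utc))

# sorted() mimics the order a key-ordered listing would return.
listing = sorted([older, newer])
```

With keys like these, the first 100 entries of a listing would be the 100 newest uploads, so no client-side sort is needed.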

Pavel Repin answered Dec 28 '22 13:12
