Boto S3 API does not return full list of keys

I use the boto S3 API in a Python script that slowly copies data from S3 to my local filesystem. The script worked well for a couple of days, but now there is a problem.

I use the following API call to obtain the list of keys in a "directory":

keys = bucket.get_all_keys(prefix=dirname)

This call (get_all_keys) does not always return the full list of keys: I can see more keys through the AWS web interface or via aws s3 ls s3://path.

I reproduced the issue on boto versions 2.15 and 2.30.

Could boto be caching some of my requests to S3 (I repeat the same requests over and over)? Any suggestions on how to resolve this?

Aleksei Petrenko asked Jul 08 '14 14:07


2 Answers

There is an easier way. The Bucket object itself can act as an iterator, and it knows how to handle paginated responses: if more results are available, it automatically fetches them behind the scenes. Something like this should let you iterate over all of the objects in your bucket:

for key in bucket:
    print(key.name)  # do something with your key

If you want to specify a prefix and get a listing of all keys starting with that prefix, you can do it like this:

for key in bucket.list(prefix='foobar'):
    print(key.name)  # do something with your key

Or, if you really, really want to build up a list of objects, just do this:

keys = [k for k in bucket]

Note, however, that a bucket can hold an unlimited number of keys, so be careful with this: it builds a list of all keys in memory.
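
If the full listing is too large to hold at once, one option is to consume the iterator in fixed-size batches. The following is only a minimal sketch: in_batches is a hypothetical helper (not part of boto), and the generator of made-up key names stands in for bucket.list(prefix='foobar') so the example runs without AWS access.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of at most `size` items without materialising the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for bucket.list(prefix='foobar'); the key names are made up.
fake_keys = ("foobar/%04d" % i for i in range(2500))

batch_sizes = [len(batch) for batch in in_batches(fake_keys, 1000)]
print(batch_sizes)  # [1000, 1000, 500]
```

Only one batch is held in memory at a time, so the batch size bounds memory use regardless of how many keys the bucket contains.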

garnaat answered Oct 03 '22 23:10


Just managed to get it working! It turned out that I had 1013 keys in my directory on S3, and get_all_keys can return at most 1000 keys per call due to AWS API restrictions.

The solution is simple: use the higher-level list function, without the delimiter parameter:

keys = list(bucket.list(prefix=dirname))
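
To illustrate why a single call stops at 1000 keys and what a paginating helper has to do, here is a rough sketch of marker-based paging. FakeKey, FakePage, FakeBucket and list_all_keys are all hypothetical stand-ins written for this example so that it runs without AWS access. It assumes the real listing call behaves similarly, accepting a marker and flagging truncated results with is_truncated.

```python
class FakeKey(object):
    """Hypothetical stand-in for a boto Key; only `name` is modelled."""

    def __init__(self, name):
        self.name = name


class FakePage(list):
    """Hypothetical stand-in for the result of one listing request."""

    is_truncated = False


class FakeBucket(object):
    """Hypothetical in-memory bucket that caps each response at 1000 keys."""

    PAGE_LIMIT = 1000

    def __init__(self, names):
        self._names = sorted(names)

    def get_all_keys(self, prefix='', marker=''):
        # Return keys with the prefix that sort strictly after the marker.
        matching = [n for n in self._names
                    if n.startswith(prefix) and n > marker]
        page = FakePage(FakeKey(n) for n in matching[:self.PAGE_LIMIT])
        page.is_truncated = len(matching) > self.PAGE_LIMIT
        return page


def list_all_keys(bucket, prefix=''):
    """Request page after page, resuming after the last key name seen."""
    keys, marker = [], ''
    while True:
        page = bucket.get_all_keys(prefix=prefix, marker=marker)
        keys.extend(page)
        if not page.is_truncated or not page:
            return keys
        marker = page[-1].name


bucket = FakeBucket('data/%04d' % i for i in range(1013))
single_call = bucket.get_all_keys(prefix='data/')
everything = list_all_keys(bucket, prefix='data/')
print(len(single_call), len(everything))  # 1000 1013
```

With 1013 keys, the single call returns the first 1000 and reports truncation, while the loop issues a second request for the remaining 13.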

Aleksei Petrenko answered Oct 03 '22 22:10