
S3 boto list keys sometimes returns directory key

I've noticed that boto's API returns different results depending on the bucket location. I have the following code:

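from boto.s3.connection import S3Connection

# Connect and list every key whose name starts with the given prefix
con = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = con.get_bucket(S3_BUCKET_NAME)
keys = bucket.list(path)
for key in keys:
    print(key)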

which I'm running against two buckets, one in us-west and one in Ireland. Here, path is a sub-directory of the bucket. Against the Ireland bucket I get the sub-directory itself plus any keys underneath it; against us-west I only get the keys underneath.

So Ireland gives:

<Key: <bucketName>,someDir/>
<Key: <bucketName>,someDir/someFile.jpg>
<Key: <bucketName>,someDir/someOtherFile.jpg>

whereas US Standard gives:

<Key: <bucketName>,someDir/someFile.jpg>
<Key: <bucketName>,someDir/someOtherFile.jpg>

Obviously, I want to be able to write the same code regardless of bucket location. Does anyone know of a way to work around this so that I get the same predictable results? Or even whether it's boto or S3 that is causing the problem? I noticed there is a different policy for naming buckets in Ireland; do different locales have their own versions of the API?

Thanks,

Steve

Steven Franklin asked Mar 31 '12 09:03



2 Answers

Thanks to Steffen, who suggested looking at how the keys are created. After further investigation I think I've got a handle on what's happening here. My original supposition that it was linked to the bucket region was a red herring. It appears to be due to what the management console does when you manipulate keys.

If you create a directory in the management console it creates a 0 byte key. This will be returned when you perform a list.

If you use boto to create/upload a file then it doesn't create the folder. Interestingly, if you delete the file from within the folder (from the AWS console) then a key is created for the folder that used to contain the key. If you then upload the key again using boto, you have exactly the same-looking structure in the UI, but in fact you have a spurious additional key for the directory. This is what was happening to me: while testing our application I was clearing out keys and then finding different results.

It's worth knowing that this happens. There is no indicator in the UI to show whether a folder is a created one (one that will be returned as a key) or an interpreted one (inferred from a key's name).
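The difference between a created folder and an interpreted one can be modelled without touching S3 at all. The following is a minimal sketch, assuming a bucket is just a flat set of key names and that a prefix listing returns every stored name starting with that prefix, in sorted order. `list_keys` and the sample sets are hypothetical stand-ins for illustration, not boto calls.

```python
def list_keys(stored_keys, prefix):
    """Return every stored key name that starts with prefix, sorted."""
    return sorted(name for name in stored_keys if name.startswith(prefix))


# Bucket populated only by uploads: the "folder" is implied by the key names.
uploaded_only = {"someDir/someFile.jpg", "someDir/someOtherFile.jpg"}

# Same files, plus a zero-byte placeholder key for the folder itself
# (assumed here to be what the console leaves behind).
with_placeholder = uploaded_only | {"someDir/"}

print(list_keys(uploaded_only, "someDir/"))
print(list_keys(with_placeholder, "someDir/"))
```

Both sets would look identical as a folder tree, but only the second listing includes the `someDir/` entry, which matches the two outputs in the question.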

Steven Franklin answered Sep 21 '22 06:09


I've had the same problem. As a workaround you can filter out all the keys with a trailing '/' to eliminate the 'directory' entries.

import boto

def files(keys):
    # Skip 'directory' placeholder keys, whose names end in '/'
    return (key for key in keys if not key.name.endswith('/'))

s3 = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = s3.get_bucket(S3_BUCKET_NAME)
keys = bucket.list(path)
for key in files(keys):
    print(key)
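A slightly stricter variant of this filter could also check the key's size, so that a non-empty object whose name happens to end in '/' is not dropped. This is only a sketch, assuming listed keys expose `name` and `size` attributes. `FakeKey` is a hypothetical stand-in used so the example runs without an S3 connection, and `is_folder_placeholder` is a name introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a listed key; only the two attributes used here.
FakeKey = namedtuple("FakeKey", ["name", "size"])


def is_folder_placeholder(key):
    """True for a zero-byte key whose name ends in '/'."""
    return key.name.endswith("/") and key.size == 0


def files(keys):
    """Yield every key that is not a folder placeholder."""
    return (key for key in keys if not is_folder_placeholder(key))


listing = [
    FakeKey("someDir/", 0),
    FakeKey("someDir/someFile.jpg", 1024),
    FakeKey("someDir/someOtherFile.jpg", 2048),
]

for key in files(listing):
    print(key.name)
```

Whether the extra size check is worth having depends on whether the bucket can contain real objects with a trailing '/'; if it cannot, the simpler name-only filter behaves the same.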
Daniel Canas answered Sep 22 '22 06:09