My bucket structure is as follows:
bucket
    production
        dt=2017-01-01
            file1.json
            ...
        dt=2017-05-01
            file2.json
What I'm looking to do is get the full paths to file1.json and file2.json so I can download them.
I'm struggling to do this in Python. Any help is appreciated. TIA.
import boto3
s3 = boto3.client('s3')
You can list all objects by calling list_objects:
objs = s3.list_objects(Bucket='mybucket')['Contents']
Using a list comprehension, get the object names, ignoring folders (which have a size of 0):
[obj['Key'] for obj in objs if obj['Size']]
Or:
s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
[key.key for key in bucket.objects.all() if key.size]
If you want to list the objects with a certain prefix:
# List all keys with the prefix 'production/dt=2017-01-01/'
s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
for obj in bucket.objects.filter(Prefix='production/dt=2017-01-01/'):
    if obj.size:
        print(obj.key)
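One caveat with the client approach: a single list call returns at most 1,000 objects, so larger buckets need pagination. Below is a minimal sketch of how that might look with the client's `list_objects_v2` paginator. The helper name `list_file_keys` and its parameters are made up for illustration, and the client is passed in so the function does not depend on how you create it.

```python
def list_file_keys(client, bucket_name, prefix=None):
    """Return the keys of all non-empty objects, following pagination.

    `client` is expected to behave like boto3.client('s3').
    `list_file_keys` is a hypothetical helper, not part of boto3.
    """
    kwargs = {"Bucket": bucket_name}
    if prefix:
        # Only restrict the listing when a prefix was given
        kwargs["Prefix"] = prefix

    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(**kwargs):
        # "Contents" is missing from a page when nothing matches
        for obj in page.get("Contents", []):
            if obj["Size"]:  # skip zero-byte "folder" placeholders
                keys.append(obj["Key"])
    return keys
```

Assuming the bucket is called 'mybucket', this could be called as `list_file_keys(boto3.client('s3'), 'mybucket', 'production/')`.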
When a list of objects is retrieved from Amazon S3, the Key of each object is always its full path:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')  # replace with your bucket name
for obj in bucket.objects.all():
    print(obj.key)
Result:
production/dt=2017-01-01/file1.json
production/dt=2017-01-01/file2.json
production/dt=2017-05-01/file1.json
production/dt=2017-05-01/file2.json
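Since the goal was to download the files, here is a rough sketch of how keys like the ones above could be fetched with the client's `download_file` method while keeping the same directory layout locally. The helper name `download_keys` and the default `dest_dir` are assumptions for illustration. The client is passed in as a parameter.

```python
import os


def download_keys(client, bucket_name, keys, dest_dir="downloads"):
    """Download each key into dest_dir, preserving the key's path.

    `client` is expected to behave like boto3.client('s3').
    `download_keys` is a hypothetical helper, not part of boto3.
    """
    local_paths = []
    for key in keys:
        if key.endswith("/"):
            # Zero-byte "folder" placeholders have nothing to download
            continue
        # Turn 'a/b/file.json' into dest_dir/a/b/file.json
        local_path = os.path.join(dest_dir, *key.split("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        client.download_file(bucket_name, key, local_path)
        local_paths.append(local_path)
    return local_paths
```

With a real client this might be called as `download_keys(boto3.client('s3'), 'mybucket', keys)`, where `keys` is the list of full paths printed above and 'mybucket' stands in for your bucket name.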