
Get full path to files in S3 using Boto3 nested keys

My bucket structure is as follows:

bucket
    production
        dt=2017-01-01
            file1.json
        ...
        dt=2017-05-01
            file2.json

What I'm looking to do is get the full path to file1.json, file2.json, so I can download them.

I'm struggling to do this in Python. Any help is appreciated. TIA.

mr-sk asked Dec 13 '22 21:12


2 Answers

import boto3

s3 = boto3.client('s3')

You can list the objects in a bucket by calling list_objects:

objs = s3.list_objects(Bucket='mybucket')['Contents']

Using a list comprehension, get the object keys, skipping folder placeholders (which have a size of 0):

[obj['Key'] for obj in objs if obj['Size']]

Or:

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
[key.key for key in bucket.objects.all() if key.size]
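
A single list_objects call returns at most 1,000 keys, so the client-based snippet above can miss objects in a larger bucket (the resource collection bucket.objects pages through the results on its own). One way to handle this with a client is a paginator. The following is a minimal sketch, assuming a boto3 S3 client is passed in; the helper name list_keys and the bucket name 'mybucket' are illustrative and not part of the original answer.

```python
def list_keys(client, bucket, prefix=""):
    """Return every non-empty object key under `prefix`, following pagination.

    `client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            if obj["Size"]:  # skip zero-byte folder placeholders
                keys.append(obj["Key"])
    return keys


# Example call (needs AWS credentials; 'mybucket' is a placeholder name):
#   import boto3
#   keys = list_keys(boto3.client("s3"), "mybucket", prefix="production/")
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object, without touching a real bucket.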

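If you want to list only the objects with a certain prefix (the prefix is matched against the full key, so it must start from the top of the bucket):

# List all keys under the prefix 'production/dt=2017-01-01/'
s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
for obj in bucket.objects.filter(Prefix='production/dt=2017-01-01/'):
    if obj.size:
        print(obj.key)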
helloV answered Dec 16 '22 09:12


When a list of objects is retrieved from Amazon S3, the Key of each object is always its full path:

import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')  # replace with your bucket name
for obj in bucket.objects.all():
    print(obj.key)

Result:

production/dt=2017-01-01/file1.json
production/dt=2017-01-01/file2.json
production/dt=2017-05-01/file1.json
production/dt=2017-05-01/file2.json
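
Because each key is already the full path, it can be passed straight to a download call. The following is a minimal sketch, assuming a boto3 S3 client and a list of keys like the ones above; the helper name download_keys and the destination directory are illustrative assumptions.

```python
import os


def download_keys(client, bucket, keys, dest_dir):
    """Download each key into dest_dir, mirroring the key's folder layout.

    `client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the list of local file paths that were written.
    """
    local_paths = []
    for key in keys:
        # S3 keys always use "/" as the separator, whatever the local OS uses.
        local_path = os.path.join(dest_dir, *key.split("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        client.download_file(bucket, key, local_path)
        local_paths.append(local_path)
    return local_paths


# Example call (needs AWS credentials; names below are placeholders):
#   import boto3
#   download_keys(boto3.client("s3"), "mybucket",
#                 ["production/dt=2017-01-01/file1.json"], "downloads")
```

Creating the parent directories first matters because download_file writes to the given filename and does not create missing folders itself.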
John Rotenstein answered Dec 16 '22 11:12