Using the Boto3 Python SDK, I was able to download files using the method bucket.download_file().
Is there a way to download an entire folder?
How to Download a Folder from AWS S3
Use the aws s3 cp command with the --recursive parameter to download an S3 folder to your local file system. The command takes the S3 source folder and the local destination directory as inputs and copies every object under the source folder.
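Since the question is about Python, the CLI route can be driven from a script as well. The following is only a sketch: it assumes the AWS CLI is installed and configured on the machine, and the function names build_s3_cp_command and download_folder_with_cli are made up for illustration.

```python
import subprocess


def build_s3_cp_command(bucket_name, s3_folder, local_dir):
    """Build the argument list for `aws s3 cp ... --recursive`."""
    prefix = s3_folder.strip("/")
    # An empty prefix means "copy the whole bucket".
    source = f"s3://{bucket_name}/{prefix}/" if prefix else f"s3://{bucket_name}/"
    return ["aws", "s3", "cp", source, local_dir, "--recursive"]


def download_folder_with_cli(bucket_name, s3_folder, local_dir):
    """Run the AWS CLI; raises CalledProcessError if the copy fails."""
    command = build_s3_cp_command(bucket_name, s3_folder, local_dir)
    subprocess.run(command, check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when the folder or directory name contains spaces.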
To download a single file we need its name, which is the key that identifies the object in the S3 bucket. In Java, one way to implement this is with Spring Boot and aws-java-sdk-s3. The Amazon S3 Java SDK provides a simple interface for storing and retrieving any amount of data, at any time, from anywhere on the web.
quick and dirty but it works:
    import os

    import boto3


    def downloadDirectoryFroms3(bucketName, remoteDirectoryName):
        s3_resource = boto3.resource('s3')
        bucket = s3_resource.Bucket(bucketName)
        for obj in bucket.objects.filter(Prefix=remoteDirectoryName):
            if obj.key.endswith('/'):
                continue  # skip "folder" placeholder objects, which are not files
            directory = os.path.dirname(obj.key)
            if directory:  # empty for keys without a slash
                os.makedirs(directory, exist_ok=True)
            bucket.download_file(obj.key, obj.key)  # save to same path
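Assuming you want to download the directory foo/bar from S3, the for loop iterates over every object whose key starts with the prefix foo/bar. The match is a plain string prefix, so a key such as foo/barista.txt would also be included; pass foo/bar/ to restrict the listing to the folder itself.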
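To make the prefix behaviour concrete, here is a small sketch that imitates the filter locally with str.startswith, so it runs without any AWS access. The helper names as_folder_prefix and keys_under, and the sample keys, are invented for this illustration.

```python
def as_folder_prefix(s3_folder):
    """Return the folder name with exactly one trailing slash ('' stays '')."""
    stripped = s3_folder.strip("/")
    return stripped + "/" if stripped else ""


def keys_under(keys, s3_folder):
    """Keep only the keys that live inside s3_folder."""
    prefix = as_folder_prefix(s3_folder)
    return [key for key in keys if key.startswith(prefix)]


keys = [
    "foo/bar/a.txt",
    "foo/bar/sub/b.txt",
    "foo/barista.txt",  # shares the string prefix "foo/bar" but is not in the folder
    "foo/other.txt",
]

loose_match = [key for key in keys if key.startswith("foo/bar")]
folder_match = keys_under(keys, "foo/bar")
```

Here loose_match still contains foo/barista.txt, while folder_match holds only the two keys under foo/bar/.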
A slightly less dirty modification of the accepted answer by Konstantinos Katsantonis:
    import os

    import boto3

    # Assumes credentials and configuration are handled outside Python,
    # in the .aws directory or in environment variables.
    s3 = boto3.resource('s3')


    def download_s3_folder(bucket_name, s3_folder, local_dir=None):
        """
        Download the contents of a folder directory

        Args:
            bucket_name: the name of the s3 bucket
            s3_folder: the folder path in the s3 bucket
            local_dir: a relative or absolute directory path in the local file system
        """
        bucket = s3.Bucket(bucket_name)
        for obj in bucket.objects.filter(Prefix=s3_folder):
            target = obj.key if local_dir is None \
                else os.path.join(local_dir, os.path.relpath(obj.key, s3_folder))
            target_dir = os.path.dirname(target)
            if target_dir:  # empty when the key has no directory part
                os.makedirs(target_dir, exist_ok=True)
            if obj.key[-1] == '/':
                continue  # "folder" placeholder object, nothing to download
            bucket.download_file(obj.key, target)
This downloads nested subdirectories, too. I was able to download a directory with over 3000 files in it. You'll find other solutions at "Boto3 to download all files from a S3 Bucket", but I don't know if they're any better.
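If you want to check how nested keys end up on disk without touching S3, one option is to pass the bucket in as a parameter so a stand-in object can be used in tests. This is a sketch, not a drop-in replacement: local_path_for_key and download_folder are hypothetical names, and the bucket argument is only assumed to offer objects.filter(Prefix=...) and download_file(key, filename) the way a boto3 Bucket resource does.

```python
import os


def local_path_for_key(key, s3_folder, local_dir):
    """Map an S3 key inside s3_folder to a file path inside local_dir."""
    stripped = s3_folder.strip("/")
    prefix = stripped + "/" if stripped else ""
    relative = key[len(prefix):]  # assumes key starts with prefix
    # S3 keys always use '/', so split on it and join with the local separator.
    return os.path.join(local_dir, *relative.split("/"))


def download_folder(bucket, s3_folder, local_dir):
    """Download every object under s3_folder; return the local paths written."""
    stripped = s3_folder.strip("/")
    prefix = stripped + "/" if stripped else ""
    written = []
    for obj in bucket.objects.filter(Prefix=prefix):
        if obj.key.endswith("/"):
            continue  # "folder" placeholder object, nothing to download
        target = local_path_for_key(obj.key, s3_folder, local_dir)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        bucket.download_file(obj.key, target)
        written.append(target)
    return written
```

With real credentials in place, the call would look like download_folder(boto3.resource('s3').Bucket('my-bucket'), 'foo/bar', 'downloads'), where the bucket and directory names are placeholders.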