
Download a folder from S3 using Boto3

Using the Boto3 Python SDK, I was able to download individual files with the bucket.download_file() method.

Is there a way to download an entire folder?

El Fadel Anas asked Apr 11 '18 10:04



2 Answers

Quick and dirty, but it works:

import os

import boto3


def downloadDirectoryFroms3(bucketName, remoteDirectoryName):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucketName)

    for obj in bucket.objects.filter(Prefix=remoteDirectoryName):
        if obj.key.endswith('/'):
            continue  # skip zero-byte "folder" placeholder objects
        local_dir = os.path.dirname(obj.key)
        if local_dir:  # empty when the key has no directory part
            os.makedirs(local_dir, exist_ok=True)
        bucket.download_file(obj.key, obj.key)  # save to same path

Assuming you want to download the directory foo/bar from S3, the for-loop iterates over every object whose key starts with the prefix foo/bar (that is, Prefix='foo/bar').
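One thing worth keeping in mind: Prefix is a plain string match on the key, so 'foo/bar' also matches keys under a sibling such as 'foo/barbaz/'. The sketch below illustrates this with a hard-coded list of made-up keys instead of a real bucket; normalize_prefix and matching are hypothetical helper names, not part of Boto3.

```python
def normalize_prefix(prefix):
    """Return the prefix with exactly one trailing slash ('' stays '')."""
    prefix = prefix.rstrip("/")
    return prefix + "/" if prefix else ""


def matching(keys, prefix):
    """Mimic the string-prefix semantics of bucket.objects.filter(Prefix=...)."""
    return [key for key in keys if key.startswith(prefix)]


# Made-up keys, for illustration only.
keys = [
    "foo/bar/a.txt",
    "foo/bar/sub/b.txt",
    "foo/barbaz/c.txt",  # sibling "folder" that shares the string prefix
    "foo/bar/",          # zero-byte "folder" placeholder object
]

print(matching(keys, "foo/bar"))                    # includes foo/barbaz/c.txt
print(matching(keys, normalize_prefix("foo/bar")))  # only keys under foo/bar/
```

Passing normalize_prefix(remoteDirectoryName) as the Prefix should restrict the loop to keys inside that one "folder", assuming that is the behaviour you want.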

Konstantinos Katsantonis answered Oct 14 '22 19:10


A slightly less dirty modification of the accepted answer by Konstantinos Katsantonis:

import os

import boto3

s3 = boto3.resource('s3')  # assumes credentials & configuration are handled outside python in .aws directory or environment variables


def download_s3_folder(bucket_name, s3_folder, local_dir=None):
    """
    Download the contents of a folder directory
    Args:
        bucket_name: the name of the s3 bucket
        s3_folder: the folder path in the s3 bucket
        local_dir: a relative or absolute directory path in the local file system
    """
    bucket = s3.Bucket(bucket_name)
    for obj in bucket.objects.filter(Prefix=s3_folder):
        target = obj.key if local_dir is None \
            else os.path.join(local_dir, os.path.relpath(obj.key, s3_folder))
        target_dir = os.path.dirname(target)
        if target_dir:  # empty when target is a bare file name
            os.makedirs(target_dir, exist_ok=True)
        if obj.key[-1] == '/':
            continue  # "folder" placeholder object: nothing to download
        bucket.download_file(obj.key, target)

This downloads nested subdirectories, too. I was able to download a directory with over 3000 files in it. You'll find other solutions under the question "Boto3 to download all files from a S3 Bucket", but I don't know if they're any better.
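To check how nested keys would map to local paths before downloading anything, the path logic can be pulled out into a small dry-run helper. This is only a sketch: plan_downloads is a hypothetical name, and the keys and the "out" directory are made up for illustration.

```python
import os


def plan_downloads(keys, s3_folder, local_dir=None):
    """Return (key, local_target) pairs using the same mapping as download_s3_folder."""
    plan = []
    for key in keys:
        if key.endswith("/"):
            continue  # "folder" placeholder object: nothing to download
        if local_dir is None:
            target = key  # mirror the S3 path under the current directory
        else:
            target = os.path.join(local_dir, os.path.relpath(key, s3_folder))
        plan.append((key, target))
    return plan


# Made-up keys, for illustration only.
example_keys = ["data/a.txt", "data/sub/b.txt", "data/sub/"]
for key, target in plan_downloads(example_keys, "data", "out"):
    print(key, "->", target)
```

With a real bucket, the key list could be built as [obj.key for obj in bucket.objects.filter(Prefix=s3_folder)], so the plan can be reviewed before any call to bucket.download_file().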

bjc answered Oct 14 '22 17:10