How to save S3 object to a file using boto3


Boto3 includes a customization that helps with this (among other things). It is exposed on the low-level S3 client and can be used like this:

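import boto3

s3_client = boto3.client('s3')

# Create a small local file to upload (the file must be opened for writing)
with open('hello.txt', 'w') as f:
    f.write('Hello, world!')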

# Upload the file to S3
s3_client.upload_file('hello.txt', 'MyBucket', 'hello-remote.txt')

# Download the file from S3
s3_client.download_file('MyBucket', 'hello-remote.txt', 'hello2.txt')
print(open('hello2.txt').read())

These functions will automatically handle reading/writing files as well as doing multipart uploads in parallel for large files.

Note that s3_client.download_file won't create missing directories for the local path. Create them first, for example with pathlib.Path('/path/to/file.txt').parent.mkdir(parents=True, exist_ok=True).
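Creating the directory and downloading can be wrapped in one small helper. This is a minimal sketch; `download_to_path` is a hypothetical name, and the function only assumes it is handed an object with a `download_file(bucket, key, filename)` method, such as a boto3 S3 client:

```python
from pathlib import Path


def download_to_path(s3_client, bucket, key, local_path):
    """Create missing parent directories, then download the object."""
    target = Path(local_path)
    # parents=True builds the whole chain; exist_ok=True makes reruns safe.
    target.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(bucket, key, str(target))
    return target


# Usage (requires AWS credentials and an existing bucket):
#   import boto3
#   download_to_path(boto3.client('s3'), 'MyBucket', 'hello-remote.txt',
#                    '/path/to/file.txt')
```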

boto3 now has a nicer interface than the client:

import boto3

resource = boto3.resource('s3')
my_bucket = resource.Bucket('MyBucket')
# key is the object's key in the bucket; local_filename is the destination path
my_bucket.download_file(key, local_filename)

By itself this isn't much better than the client approach in the accepted answer (although the docs say it does a better job of retrying uploads and downloads on failure). But resources are generally more ergonomic (for example, the S3 bucket and object resources are nicer to work with than the client methods), so this lets you stay at the resource layer without having to drop down to the client.

Resources generally can be created in the same way as clients, and they take all or most of the same arguments and just forward them to their internal clients.

If you want to emulate boto2 methods such as set_contents_from_string, you can try:

import boto3
from cStringIO import StringIO  # Python 2 only

s3c = boto3.client('s3')
contents = 'My string to save to S3 object'
target_bucket = 'hello-world.by.vor'
target_file = 'data/hello.txt'
fake_handle = StringIO(contents)

# notice if you do fake_handle.read() it reads like a file handle
s3c.put_object(Bucket=target_bucket, Key=target_file, Body=fake_handle.read())

For Python 3:

In Python 3, the StringIO and cStringIO modules are gone. Import StringIO from io instead:

from io import StringIO

To support both versions:

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
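On Python 3 the StringIO detour can be skipped altogether, because put_object accepts bytes for Body. A minimal sketch; `put_text` is a hypothetical helper name, and it only assumes it is given an object with a `put_object` method, such as a boto3 S3 client:

```python
def put_text(s3_client, bucket, key, text, encoding='utf-8'):
    """Store a string as the body of an S3 object."""
    # Encode explicitly so the stored bytes do not depend on any default.
    body = text.encode(encoding)
    return s3_client.put_object(Bucket=bucket, Key=key, Body=body)


# Usage (requires AWS credentials and an existing bucket):
#   import boto3
#   put_text(boto3.client('s3'), 'hello-world.by.vor', 'data/hello.txt',
#            'My string to save to S3 object')
```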

# Preface: the file is JSON with contents: {"name": "Android", "status": "ERROR"}

import io
import json

import boto3

s3 = boto3.resource('s3')

obj = s3.Object('my-bucket', 'key-to-file.json')
data = io.BytesIO()
obj.download_fileobj(data)

# data now holds the object's raw bytes; decode and parse them into a dict
new_dict = json.loads(data.getvalue().decode("utf-8"))

print(new_dict['status'])
# Should print "ERROR"
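For small objects, a similar result can be had without the intermediate buffer by reading the response body directly. This is a sketch under the assumption that the object fits comfortably in memory; `read_json_object` is a hypothetical helper name, and it expects something shaped like a boto3 `s3.Object`, whose `get()` returns a dict with a readable `Body`:

```python
import json


def read_json_object(s3_object):
    """Fetch an S3 object and parse its body as JSON."""
    response = s3_object.get()           # performs a GetObject request
    raw_bytes = response['Body'].read()  # read the whole body into memory
    return json.loads(raw_bytes.decode('utf-8'))


# Usage (requires AWS credentials and an existing object):
#   import boto3
#   obj = boto3.resource('s3').Object('my-bucket', 'key-to-file.json')
#   print(read_json_object(obj)['status'])
```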

To download a specific version of an object, you can call get_object with a VersionId.

import boto3

bucket = 'bucketName'
prefix = 'path/to/file/'
filename = 'fileName.ext'

s3c = boto3.client('s3')
s3r = boto3.resource('s3')

if __name__ == '__main__':
    for version in s3r.Bucket(bucket).object_versions.filter(Prefix=prefix + filename):
        # version.id already holds the VersionId, so no extra request is needed
        version_id = version.id
        obj = s3c.get_object(
            Bucket=bucket,
            Key=prefix + filename,
            VersionId=version_id,
        )
        with open(f"{filename}.{version_id}", 'wb') as f:
            for chunk in obj['Body'].iter_chunks(chunk_size=4096):
                f.write(chunk)

Ref: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/response.html
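The managed transfer methods can also fetch a specific version: download_file accepts a VersionId entry in its ExtraArgs dictionary. A minimal sketch; `download_version` is a hypothetical helper name, and the version ID passed in is assumed to come from a listing like the one above:

```python
def download_version(s3_client, bucket, key, version_id, local_path):
    """Download one specific version of an object to local_path."""
    s3_client.download_file(
        bucket,
        key,
        local_path,
        ExtraArgs={'VersionId': version_id},
    )
    return local_path


# Usage (requires AWS credentials and a bucket with versioning enabled):
#   import boto3
#   download_version(boto3.client('s3'), 'bucketName',
#                    'path/to/file/fileName.ext', version_id,
#                    'fileName.ext.' + version_id)
```

Compared with streaming the body by hand, this keeps the retry and multipart handling that download_file provides.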

Note: I'm assuming you have configured authentication separately. The code below downloads a single object from an S3 bucket.

import boto3

# Create an S3 resource
s3 = boto3.resource('s3')

# Download the object to a local file
s3.Bucket('mybucket').download_file('hello.txt', '/tmp/hello.txt')
