Open S3 object as a string with Boto3


read() returns bytes. In Python 3, if you want a string, you have to decode the bytes using the right encoding:

import boto3

s3 = boto3.resource('s3')

# bucket and key are strings: the bucket name and the object key
obj = s3.Object(bucket, key)
text = obj.get()['Body'].read().decode('utf-8')
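If you do this in more than one place, the read-and-decode step can be wrapped in a small helper. This is only a sketch: read_object_text is a made-up name, not part of boto3, and it assumes the argument behaves like a boto3 s3.Object (a get() method returning a dict whose 'Body' has read()).

```python
def read_object_text(s3_object, encoding="utf-8", errors="strict"):
    """Return the content of an S3 object as a str.

    s3_object is assumed to behave like boto3's s3.Object: get() returns
    a dict whose 'Body' entry is a file-like object with read().
    """
    body = s3_object.get()["Body"]
    raw = body.read()  # bytes in Python 3
    # Decode with the encoding the object was written in (UTF-8 assumed here).
    return raw.decode(encoding, errors)
```

With a real object this would be called as read_object_text(s3.Object(bucket, key)). Pass a different encoding, for example 'latin-1', if the object was not stored as UTF-8.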

I had trouble reading and parsing the object from S3 with .get() under Python 2.7 inside an AWS Lambda.

I added json to the example to show it became parsable :)

import boto3
import json

s3 = boto3.client('s3')

obj = s3.get_object(Bucket=bucket, Key=key)
j = json.loads(obj['Body'].read())

NOTE (for Python 2.7): My object is all ASCII, so I don't need .decode('utf-8').

NOTE (for Python 3.6+): We moved to Python 3.6 and found that read() now returns bytes, so to get a string out of it you must use:

j = json.loads(obj['Body'].read().decode('utf-8'))
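A side note on the JSON case: from Python 3.6 on, json.loads() and json.load() also accept bytes encoded as UTF-8, UTF-16 or UTF-32, so for JSON the explicit decode is mostly needed when you want the intermediate string or the data uses another encoding. A minimal sketch, assuming client behaves like boto3.client('s3'); load_s3_json is a made-up helper name:

```python
import json


def load_s3_json(client, bucket, key):
    """Fetch an S3 object and parse it as JSON.

    client is assumed to behave like boto3.client('s3'): get_object()
    returns a dict whose 'Body' entry is a file-like object with read().
    """
    response = client.get_object(Bucket=bucket, Key=key)
    # json.load() calls read() on the body; on Python 3.6+ the resulting
    # bytes are parsed directly if they are UTF-8, UTF-16 or UTF-32.
    return json.load(response["Body"])
```

With a real client this would be j = load_s3_json(boto3.client('s3'), bucket, key).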

This isn't in the boto3 documentation. This worked for me:

object.get()["Body"].read()

object being an s3 object: http://boto3.readthedocs.org/en/latest/reference/services/s3.html#object

Python 3, using the boto3 client API.

With the S3.Client.download_fileobj API and a Python file-like object, the S3 object's content can be retrieved into memory.

Since the retrieved content is bytes, it needs to be decoded to convert it to str.

import io
import boto3

client = boto3.client('s3')
bytes_buffer = io.BytesIO()
client.download_fileobj(Bucket=bucket_name, Key=object_key, Fileobj=bytes_buffer)
byte_value = bytes_buffer.getvalue()
str_value = byte_value.decode()  # Python 3: the default encoding is UTF-8
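If you want the text line by line instead of as one big string, the filled buffer can be wrapped in io.TextIOWrapper. A rough sketch, not the only way to do it: buffer_to_lines is a made-up name, and it assumes the buffer is a seekable io.BytesIO like bytes_buffer above.

```python
import io


def buffer_to_lines(bytes_buffer, encoding="utf-8"):
    """Decode a filled BytesIO and return its lines without line endings."""
    # After download_fileobj the position is at the end, so rewind first.
    bytes_buffer.seek(0)
    text_stream = io.TextIOWrapper(bytes_buffer, encoding=encoding)
    try:
        # Default newline handling turns '\r\n' and '\r' into '\n'.
        return [line.rstrip("\n") for line in text_stream]
    finally:
        # Detach so the wrapper does not close the caller's buffer.
        text_stream.detach()
```

After the download above, lines = buffer_to_lines(bytes_buffer) gives a list of str, and bytes_buffer stays open for further use.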
