Python - How to read CSV file retrieved from S3 bucket?

Tags: python, csv, amazon-s3

There's a CSV file in an S3 bucket that I want to parse and turn into a dictionary in Python. Using Boto3, I called s3.get_object(Bucket=<bucket_name>, Key=<key>), which returns a dictionary that includes a "Body": StreamingBody() key-value pair that apparently contains the data I want.

In my Python file, I've added import csv, and the examples I see online for reading a CSV file all pass a file name, such as:

with open(<csv_file_name>, mode='r') as file:
    reader = csv.reader(file)

However, I'm not sure how to retrieve the CSV file name from the StreamingBody, if that's even possible. If not, is there a better way for me to read the CSV file in Python? Thanks!

Edit: Wanted to add that I'm doing this in AWS Lambda, and there are documented issues with using pandas in Lambda, which is why I want to use the csv library and not pandas.

asked Oct 25 '17 22:10

Louis

2 Answers

csv.reader does not require a file. It can use anything that iterates through lines, including files and lists.

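So you don't need a filename. Just pass the lines from response['Body'] into the reader. In Python 3, read() returns bytes and csv.reader expects text, so decode first (UTF-8 is assumed here). One way to do that is:

lines = response['Body'].read().decode('utf-8').splitlines(True)
reader = csv.reader(lines)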
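
As a rough illustration of the idea, the sketch below parses a body without any file name. It uses io.BytesIO as a stand-in for response['Body'] so it runs without AWS access; the helper name parse_csv_body and the UTF-8 default are assumptions, not part of Boto3.

```python
import csv
import io


def parse_csv_body(body, encoding="utf-8"):
    """Return CSV rows (lists of strings) from a binary file-like object.

    `body` is anything whose read() returns bytes, such as the
    response['Body'] value from get_object. The function name and the
    UTF-8 default are assumptions made for this sketch.
    """
    # read() returns bytes in Python 3; csv.reader needs text lines.
    text = body.read().decode(encoding)
    # splitlines(True) keeps line endings, so quoted fields that span
    # several lines are still parsed as one field.
    return list(csv.reader(text.splitlines(True)))


# Stand-in for response['Body'] so the sketch runs locally.
fake_body = io.BytesIO(b"name,qty\napple,3\npear,5\n")
rows = parse_csv_body(fake_body)
```

With a real response, the call would be parse_csv_body(response['Body']). This reads the whole object into memory, which is usually fine for small files.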

answered Oct 17 '22 15:10

Aaron Bentley

To retrieve and read a CSV file from an S3 bucket, you can use the following code:

import csv
import boto3
from django.conf import settings

bucket_name = "your-bucket-name"
file_name = "your-file-name-exists-in-that-bucket.csv"

s3 = boto3.resource('s3', aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
                    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)

bucket = s3.Bucket(bucket_name)

obj = bucket.Object(key=file_name)

response = obj.get()
lines = response['Body'].read().decode('utf-8').splitlines(True)

reader = csv.DictReader(lines)
for row in reader:
    # Replace csv_header_key1 and csv_header_key2 with column names from your CSV header row
    print(row['csv_header_key1'], row['csv_header_key2'])
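
If the file is large, a possible variation is to decode the body incrementally instead of calling read() on the whole object. The sketch below assumes a client-style object with get_object(Bucket=..., Key=...), as returned by boto3.client('s3'), and UTF-8 content; iter_csv_dicts and FakeS3Client are hypothetical names used only so the example runs without AWS credentials.

```python
import codecs
import csv
import io


def iter_csv_dicts(s3_client, bucket, key, encoding="utf-8"):
    """Yield one dict per CSV row from an S3 object.

    The function name and the UTF-8 default are assumptions for this sketch.
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # Wrap the binary body in an incremental decoder so csv can iterate
    # over text lines without reading the whole object up front.
    text_stream = codecs.getreader(encoding)(response["Body"])
    yield from csv.DictReader(text_stream)


class FakeS3Client:
    """Hypothetical stand-in for boto3.client('s3'), for local runs only."""

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(b"id,name\n1,Ada\n2,Grace\n")}


records = list(iter_csv_dicts(FakeS3Client(), "your-bucket-name", "data.csv"))
```

In a Lambda function you would pass boto3.client('s3') in place of FakeS3Client(); credentials would then normally come from the function's execution role rather than from explicit keys.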

answered Oct 17 '22 17:10

Chirag Kalal
