
Read Excel from S3 - AttributeError: 'StreamingBody' object has no attribute 'seek'

I have a Python script that reads an Excel file from S3, but I get an error when it is triggered in AWS Batch. The same code works fine on another Ubuntu box.

AttributeError: 'StreamingBody' object has no attribute 'seek'

The section of my code that reads the Excel file is below:

import boto3
import pandas as pd

# config, s3_bucket, s3_file and sheet_name are defined elsewhere in the script
session = boto3.Session(aws_access_key_id=config.access_key_id, aws_secret_access_key=config.secret_access_key)
client = session.client('s3')
obj = client.get_object(Bucket=s3_bucket, Key=s3_file)
df = pd.read_excel(obj['Body'], sheet_name=sheet_name, skiprows=1)

Any help is much appreciated.

mtryingtocode asked Sep 06 '19 03:09



1 Answer

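It seems that read_excel has changed its requirements for the "file-like" object passed in: the object now has to have a seek method, and the StreamingBody in obj['Body'] does not have one. I solved this by reading the whole body into memory and wrapping it in io.BytesIO, that is, by changing pd.read_excel(obj['Body']) to pd.read_excel(io.BytesIO(obj['Body'].read())). This needs import io at the top of the script.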
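For illustration, here is a minimal sketch of that workaround wrapped in two helpers. The function names to_seekable and read_excel_from_s3 are made up for this example and are not part of boto3 or pandas; the sketch also assumes the file is small enough to hold in memory, since read() pulls the entire object at once.

```python
import io


def to_seekable(body):
    """Copy a read-only stream into an in-memory buffer that supports seek().

    `body` only needs a read() method, which is all StreamingBody offers here.
    (Hypothetical helper name, not a boto3 or pandas function.)
    """
    return io.BytesIO(body.read())


def read_excel_from_s3(client, bucket, key, **read_excel_kwargs):
    """Fetch an object from S3 and parse it with pandas.read_excel.

    `client` is assumed to be a boto3 S3 client, e.g. session.client('s3').
    (Hypothetical helper name, written for this example only.)
    """
    # Imported here so to_seekable() can be used without pandas installed.
    import pandas as pd

    obj = client.get_object(Bucket=bucket, Key=key)
    # Wrap the StreamingBody so read_excel gets an object it can seek on.
    return pd.read_excel(to_seekable(obj["Body"]), **read_excel_kwargs)
```

With the variables from the question, the call would look like df = read_excel_from_s3(client, s3_bucket, s3_file, sheet_name=sheet_name, skiprows=1).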

Rory answered Nov 11 '22 18:11
