Read excel file from S3 into Pandas DataFrame

Tags: python, pandas, lambda, amazon-s3

I have an SNS notification set up that triggers a Lambda function when a .xlsx file is uploaded to an S3 bucket.

The Lambda function reads the .xlsx file into a pandas DataFrame.

import os 
import pandas as pd
import json
import xlrd
import boto3

def main(event, context):
    message = event['Records'][0]['Sns']['Message']
    parsed_message = json.loads(message)
    src_bucket = parsed_message['Records'][0]['s3']['bucket']['name']
    filepath = parsed_message['Records'][0]['s3']['object']['key']

    s3 = boto3.resource('s3')
    s3_client = boto3.client('s3')

    obj = s3_client.get_object(Bucket=src_bucket, Key=filepath)
    print(obj['Body'])

    df = pd.read_excel(obj, header=2)
    print(df.head(2))

I get the following error:

Invalid file path or buffer object type: <type 'dict'>: ValueError
Traceback (most recent call last):
  File "/var/task/handler.py", line 26, in main
    df = pd.read_excel(obj, header=2)
  File "/var/task/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/io/excel.py", line 307, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/var/task/pandas/io/excel.py", line 376, in __init__
    io, _, _, _ = get_filepath_or_buffer(self._io)
  File "/var/task/pandas/io/common.py", line 218, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))
ValueError: Invalid file path or buffer object type: <type 'dict'>

How can I resolve this?

asked Jan 14 '19 16:01

Raj

3 Answers

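That error is expected: obj is a dictionary (the whole get_object response), not a file. Have you tried passing its 'Body' entry instead? Note that the key is capitalized.

df = pd.read_excel(obj['Body'], header=2)

Depending on the pandas version, the streaming body may first need to be read into an in-memory buffer, for example io.BytesIO(obj['Body'].read()).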

answered Sep 30 '22 04:09

Tarik Elkalai

pandas now accepts an S3 URL as a file path, so it can read the Excel file directly from S3 without a separate get_object call. This requires the s3fs package to be installed.

See here for a CSV example: https://stackoverflow.com/a/51777553/52954
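The same idea for an Excel file could look roughly like the sketch below. It assumes Python 3, a pandas install with an Excel engine such as openpyxl, and s3fs available in the Lambda package. The helper names s3_url and read_excel_from_s3 are made up for illustration, and header=2 is taken from the question.

```python
def s3_url(bucket, key):
    """Build the s3:// URL that pandas hands to s3fs (hypothetical helper)."""
    return f"s3://{bucket}/{key}"


def read_excel_from_s3(bucket, key, header=2):
    """Read an .xlsx object straight from S3 into a DataFrame.

    Assumes s3fs and an Excel engine (e.g. openpyxl) are installed, and that
    the Lambda role is allowed to read the object.
    """
    # Imported here so s3_url can be tried without pandas installed;
    # a top-level import works just as well inside Lambda.
    import pandas as pd

    return pd.read_excel(s3_url(bucket, key), header=header)
```

In the handler from the question this would replace both the get_object call and the read_excel call, e.g. df = read_excel_from_s3(src_bucket, filepath).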

answered Sep 30 '22 06:09

LiorH

Try pd.read_excel(obj['Body'].read()), which passes the file's bytes instead of the response dictionary.
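For reference, a minimal sketch of the whole handler with the body read into memory might look like this. It assumes Python 3 and a pandas install with an Excel engine such as openpyxl. The helper name extract_s3_location is hypothetical, and wrapping the bytes in io.BytesIO is my own addition, since as far as I know recent pandas versions prefer a file-like object over raw bytes.

```python
import io
import json
from urllib.parse import unquote_plus


def extract_s3_location(event):
    """Return (bucket, key) from an S3 event delivered through SNS."""
    message = json.loads(event['Records'][0]['Sns']['Message'])
    s3_info = message['Records'][0]['s3']
    # Object keys in S3 event notifications are URL-encoded
    # (a space in the key arrives as '+'), so decode before use.
    return s3_info['bucket']['name'], unquote_plus(s3_info['object']['key'])


def main(event, context):
    # Imported here so the helper above can be tried without boto3/pandas
    # installed; top-level imports work just as well inside Lambda.
    import boto3
    import pandas as pd

    bucket, key = extract_s3_location(event)
    obj = boto3.client('s3').get_object(Bucket=bucket, Key=key)

    # obj is a dict and obj['Body'] is a stream. Read it fully and wrap the
    # bytes in a seekable in-memory buffer before handing it to pandas.
    buffer = io.BytesIO(obj['Body'].read())
    df = pd.read_excel(buffer, header=2)
    print(df.head(2))
```

Reading the whole object into memory is fine for typical spreadsheets, but it is bounded by the memory configured for the Lambda function.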

answered Sep 30 '22 04:09

Ritman Cronestar
