Store Excel file exported from Pandas in AWS

I'm making a small website using Flask, with a SQLite database. One of the things I want to do is take some data from the database, export it as an Excel file, and offer that file for download. One way to do this is to have Pandas write an Excel file that is stored on the web server, and to use Flask's send_file to offer the download.

However, is it possible to provide a downloadable Excel file without storing the file "locally" on the server, for example by keeping it on AWS S3 instead? I want the storage used on the web server to stay predictable. (And I'd like to know whether it's possible in any case.)

One option might be to write the file "locally", upload it to AWS, and then delete it from the server. Ideally I'd rather capture the file stream directly and send that to S3, but I don't think that's possible, since as far as I can tell to_excel only takes a file path (or an ExcelWriter object, which itself takes a file path).

Vegard Stikbakke asked Feb 25 '19 09:02


2 Answers

To add to balderman's answer, the complete code for getting it to S3 would be:

import io
import pandas as pd
import boto3

# ...

# make data frame 'df'

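# Write the workbook into an in-memory buffer instead of a file on disk.
with io.BytesIO() as output:
    with pd.ExcelWriter(output, engine='xlsxwriter') as writer:
        df.to_excel(writer)
    # The workbook is complete once the writer has closed; copy out the bytes.
    data = output.getvalue()

# Upload the bytes to S3 ('my-bucket' and 'data.xlsx' are placeholder names).
s3 = boto3.resource('s3')
s3.Bucket('my-bucket').put_object(Key='data.xlsx', Body=data)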

See also the XlsxWriter documentation.

Vegard Stikbakke answered Oct 13 '22 20:10


Taken from here: Write to StringIO object using Pandas Excelwriter?

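You can write the workbook to an in-memory 'output' buffer and then dump that to S3:

import io
import pandas as pd

# In-memory binary buffer (Python 3; the Python 2 original used StringIO.StringIO()).
output = io.BytesIO()

# Use the buffer as the file handle; leaving the with-block saves the workbook.
# 'df' is assumed to be an existing DataFrame.
with pd.ExcelWriter(output, engine='xlsxwriter') as writer:
    df.to_excel(writer)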
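
To actually dump the buffer to S3, one possibility is to rewind it and hand it to boto3 as a file-like object. This is only a sketch, not a definitive implementation: the helper name upload_frame_as_excel is hypothetical, the bucket and key are whatever you choose, and the client is passed in as an argument on the assumption that you supply something like boto3.client("s3"). No engine is named, so pandas uses whichever xlsx writer is installed.

```python
import io

import pandas as pd


def upload_frame_as_excel(df, s3_client, bucket, key):
    """Write df to an in-memory .xlsx workbook and upload it (hypothetical helper).

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    buffer = io.BytesIO()
    # Leaving the with-block closes the writer and finishes the workbook.
    with pd.ExcelWriter(buffer) as writer:
        df.to_excel(writer, index=False)
    # Rewind so the upload starts reading at the first byte.
    buffer.seek(0)
    s3_client.upload_fileobj(buffer, bucket, key)
```

It would be called as upload_frame_as_excel(df, boto3.client("s3"), "my-bucket", "data.xlsx"). Passing the client in, rather than creating it inside the helper, also makes it easy to substitute a stub when testing.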

balderman answered Oct 13 '22 22:10