Save a data frame to S3 in feather format

I have a data frame, let's say:

import pandas as pd
df = pd.DataFrame({'a': [1, 4], 'b': [1, 3]})

I want to save it to S3 as a Feather file, but I can't find a working way to do it.

I tried s3bp and s3fs, but neither did the trick.

Any suggestions?

amarchin asked Feb 07 '18 13:02


1 Answer

The solution that worked for me is to write the Feather data to an in-memory buffer and upload the buffer's contents with boto3:

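import boto3
import pandas as pd

from io import BytesIO
from pyarrow.feather import write_feather

df = pd.DataFrame({'a': [1, 4], 'b': [1, 3]})

s3_resource = boto3.resource('s3')
# Serialize the data frame into an in-memory buffer instead of a local file.
with BytesIO() as f:
    write_feather(df, f)
    # Upload the raw Feather bytes; replace 'bucket-name' and 'file_name' with your own values.
    s3_resource.Object('bucket-name', 'file_name').put(Body=f.getvalue())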
amarchin answered Sep 23 '22 13:09