Convert Pandas DataFrame to & from In-Memory Feather

Using the IO tools in pandas it is possible to convert a DataFrame to an in-memory feather buffer:

import pandas as pd  
from io import BytesIO 

df = pd.DataFrame({'a': [1,2], 'b': [3.0,4.0]})  

buf = BytesIO()

df.to_feather(buf)

However, using the same buffer to convert back to a DataFrame

pd.read_feather(buf)

results in an error:

ArrowInvalid: Not a feather file

How can a DataFrame be converted to an in-memory feather representation and, correspondingly, back to a DataFrame?

Thank you in advance for your consideration and response.

asked Jun 08 '18 13:06

Ramón J Romero y Vigil

1 Answer

With pandas==0.25.2 this can be accomplished in the following way:

import pandas
import io
df = pandas.DataFrame(data={'a': [1, 2], 'b': [3.0, 4.0]})
buf = io.BytesIO()
df.to_feather(buf)
output = pandas.read_feather(buf)

Then a call to output.head(2) returns:

    a    b
 0  1  3.0
 1  2  4.0

If your DataFrame has a MultiIndex (or any other non-default index), you may see an error like:

ValueError: feather does not support serializing for the index; you can .reset_index() to make the index into column(s)

In that case, call .reset_index() before to_feather, and call .set_index([...]) after read_feather to restore the index.

One last point: if you hand the BytesIO to something else after writing the feather bytes, you need to seek back to 0 first, otherwise the consumer starts reading at the end of the buffer. For example:

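buffer = io.BytesIO()
df.reset_index(drop=False).to_feather(buffer)
buffer.seek(0)  # rewind so the upload reads from the first byte
# s3_client is assumed to be an already-created S3 client (presumably boto3)
s3_client.put_object(Body=buffer, Bucket='bucket', Key='file')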

answered Sep 21 '22 14:09

luksfarris

Tags:

python

python-3.x

pandas

feather

apache-arrow
