Using the IO tools in pandas, it is possible to convert a DataFrame
to an in-memory feather buffer:
import pandas as pd
from io import BytesIO
df = pd.DataFrame({'a': [1,2], 'b': [3.0,4.0]})
buf = BytesIO()
df.to_feather(buf)
However, using the same buffer to convert back to a DataFrame
pd.read_feather(buf)
results in an error:
ArrowInvalid: Not a feather file
How can a DataFrame be converted to an in-memory feather representation and, correspondingly, back to a DataFrame?
Thank you in advance for your consideration and response.
With pandas==0.25.2 this can be accomplished in the following way:
import pandas
import io
df = pandas.DataFrame(data={'a': [1, 2], 'b': [3.0, 4.0]})
buf = io.BytesIO()
df.to_feather(buf)
output = pandas.read_feather(buf)
Then a call to output.head(2) returns:
   a    b
0  1  3.0
1  2  4.0
If you have a DataFrame with multiple indexes, you may see an error like
ValueError: feather does not support serializing for the index; you can .reset_index() to make the index into column(s)
In that case, call .reset_index() before to_feather and .set_index([...]) after read_feather.
One last thing: if you go on to pass the BytesIO object to something else, you need to seek back to 0 after writing the feather bytes. For example:
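# s3_client is assumed to be a boto3 S3 client, e.g. boto3.client('s3')
buffer = io.BytesIO()
df.reset_index(drop=False).to_feather(buffer)
buffer.seek(0)  # rewind to the start so the upload reads the whole buffer
s3_client.put_object(Body=buffer, Bucket='bucket', Key='file')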
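To see why the seek matters, here is a small standard-library sketch. The payload is a stand-in byte string rather than real feather data, since the behaviour comes from BytesIO itself and not from pandas.

```python
import io

buffer = io.BytesIO()
buffer.write(b"feather bytes would go here")  # stand-in payload, not real feather data

# After writing, the position sits at the end, so a plain read() returns nothing.
at_end = buffer.read()

buffer.seek(0)  # rewind to the start
from_start = buffer.read()

# getvalue() returns the full contents regardless of the current position.
whole = buffer.getvalue()

print(len(at_end), len(from_start), len(whole))
```

A consumer that calls read() on the buffer, as an upload client typically does, would therefore see an empty body if the buffer were not rewound first.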