Can you append to a .feather format?

Tags:

feather

Is there a way to append to a .feather format file using pd.to_feather?

I am also curious if anyone knows some of the limitations in terms of max file size, and whether it is possible to query for some specific data when you read a .feather file (such as read rows where date > '2017-03-31').

I love the idea of being able to store my dataframes and categorical data.

325

asked Jun 17 '17 18:06

trench

1 Answer

Unfortunately not. Both feather and parquet are column-oriented file formats, which means you can't "append" to them the way you can with a row-oriented format. An alternative, if you want to stay with parquet or feather, is to partition the data across several files. For example, if your data doesn't change once it is written and is generated once per day, you can write one file per date. This adds some overhead when reading and writing, but it may be a better option than rewriting the entire file each time.

Because it's a columnar format, you also can't query the file and read in only the rows where, e.g., date > 2017-01-01. What parquet excels at instead is letting you read in only the columns you need for your analysis.

125

answered Oct 05 '22 13:10

Pureluck
