Is there a way to append to a .feather format file using pandas' DataFrame.to_feather?
I am also curious if anyone knows some of the limitations in terms of max file size, and whether it is possible to query for some specific data when you read a .feather file (such as read rows where date > '2017-03-31').
I love the idea of being able to store my dataframes and categorical data.
Feather is a fast, lightweight, and easy-to-use binary file format for storing data frames. It has a few specific design goals. Lightweight, minimal API: make pushing data frames in and out of memory as simple as possible. Language agnostic: Feather files are the same whether written by Python or R code.
Unfortunately, both Feather and Parquet are column-oriented formats, which means you can't simply "append" rows to an existing file the way you can with a row-oriented format. If you want to stay with Parquet or Feather, an alternative is to partition the data across several files. For example, if the data doesn't change once written and is generated once per day, you can write one file per date. That adds some overhead when reading and writing, but it may be a better option than rewriting the entire file each time.
As it's a columnar format, pandas' read_feather also gives you no way to read in only the rows where, e.g., date > '2017-01-01'; you load the data first and filter afterwards. What Parquet excels at instead is letting you read in only the columns you need for your analysis.