I am working on a very large dataset with 20 million+ records. I am trying to save all of that data in the Feather format for faster access, and also append to it as I proceed with my analysis.
Is there a way to append a pandas DataFrame to an existing Feather file?
Feather files are intended to be written at once. Thus appending to them is not a supported use case.
Instead, for such a large dataset I would recommend writing the data into individual Apache Parquet files using pyarrow.parquet.write_table or pandas.DataFrame.to_parquet, and reading the data back into pandas using pyarrow.parquet.ParquetDataset or pandas.read_parquet. These functions can treat a collection of Parquet files as a single dataset that is read at once into a single DataFrame.