pandas write dataframe to parquet format with append

Tags: python, pandas, apache, parquet

I am trying to write a pandas DataFrame to the Parquet file format (introduced in the most recent pandas version, 0.21.0) in append mode. However, instead of appending to the existing file, the file is overwritten with the new data. What am I missing?

The write syntax is:

df.to_parquet(path, mode='append')

The read syntax is:

pd.read_parquet(path)

asked Nov 08 '17 23:11

Siraj S.

2 Answers

To append, do this:

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

dataframe = pd.read_csv('content.csv')
output = "/Users/myTable.parquet"

# Create a pyarrow Table from your dataframe
table = pa.Table.from_pandas(dataframe)

# Write the table into the dataset directory at root_path
pq.write_to_dataset(table, root_path=output)

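Each call adds a new file to the dataset directory at root_path, so the new rows are appended to the existing data instead of overwriting it. Note that output is therefore a directory of Parquet files, not a single file.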

answered Sep 23 '22 16:09

Victor Faro

DataFrame.to_parquet() has no append mode. What you can do instead is read the existing file, add the new rows to it, and write the result back, overwriting the original file.

answered Sep 25 '22 16:09

ben26941
