How can I convert Parquet to CSV from a local file system (e.g. Python, some library, etc.) but WITHOUT Spark? I'm looking for as simple and minimalistic a solution as possible, because I need to automate everything and don't have many resources.
I tried parquet-tools on my Mac, but the data output did not look correct.
The output needs to be such that when data is missing in some columns, the CSV has a corresponding NULL (an empty field between two commas).
Thanks.
PyArrow provides Python bindings to the Apache Arrow C++ Parquet implementation, which enables reading and writing Parquet files with pandas as well.
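If you would rather skip pandas entirely, pyarrow can also read the Parquet file into an Arrow Table and write it straight to CSV. A minimal sketch, assuming the file names below are placeholders for your own paths:

```python
import pyarrow.parquet as pq
import pyarrow.csv as pacsv

# Read the Parquet file into an Arrow Table
table = pq.read_table('filename.parquet')

# Write the table to CSV; null values should come out as empty fields by default
pacsv.write_csv(table, 'filename.csv')
```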
You can do this by using the Python packages pandas and pyarrow (pyarrow is an optional dependency of pandas that you need for this feature).
```python
import pandas as pd

df = pd.read_parquet('filename.parquet')
df.to_csv('filename.csv')
```
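Regarding the NULL requirement: by default, to_csv writes missing values (NaN/None) as empty fields, i.e. nothing between two commas. You can make that explicit with the na_rep parameter, and pass index=False if you don't want the DataFrame index written as an extra column. A sketch using the same placeholder file names:

```python
import pandas as pd

df = pd.read_parquet('filename.parquet')

# Missing values are written as empty fields by default (na_rep='');
# index=False keeps the row index out of the CSV
df.to_csv('filename.csv', na_rep='', index=False)
```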
If you need to modify the contents before writing the CSV, you can apply standard pandas operations to df.
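For example, a small sketch of such modifications; the column names 'internal_id' and 'amount' are purely hypothetical placeholders for columns in your data:

```python
import pandas as pd

df = pd.read_parquet('filename.parquet')

# Hypothetical tweaks before writing: drop a column and filter rows
df = df.drop(columns=['internal_id'])
df = df[df['amount'] > 0]

df.to_csv('filename.csv', index=False)
```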