Is there a way to serialize data in Apache Parquet format using C#? I can't find any implementation of that. The official Parquet docs say that "Thrift can be also code-genned into any other thrift-supported language.", but I'm not sure what this actually means.
Thanks
With the query results stored in a DataFrame, we can use petl to extract, transform, and load the Parquet data. In this example, we extract Parquet data, sort the data by the Column1 column, and load the data into a CSV file.
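To make the sort-and-load step concrete, here is a minimal standard-library sketch. It is not petl itself: the rows, the `Column2` name, and the `sort_and_write_csv` helper are all made up for illustration, and it assumes the query results are already in memory as a list of dicts rather than a DataFrame.

```python
import csv
import io

# Hypothetical rows standing in for the query results.
# "Column1" comes from the text above; "Column2" and the values are assumptions.
rows = [
    {"Column1": "pear", "Column2": 3},
    {"Column1": "apple", "Column2": 7},
    {"Column1": "fig", "Column2": 1},
]


def sort_and_write_csv(records, key, out):
    """Sort records by `key` and write them as CSV to the file-like object `out`."""
    if not records:
        return []
    ordered = sorted(records, key=lambda r: r[key])
    writer = csv.DictWriter(
        out, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(ordered)
    return ordered


# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
sort_and_write_csv(rows, "Column1", buffer)
csv_text = buffer.getvalue()
```

With petl installed, the same two steps would normally be done with its own sort and CSV-writing functions instead of this helper.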
Parquet data is always serialized using its own file format. This is why Parquet can't read files serialized using Avro's storage format, and vice versa.
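One way to see that the two are distinct on-disk formats is to look at their magic bytes: a Parquet file starts and ends with `PAR1`, while an Avro object container file starts with `Obj` followed by the byte `0x01`. The sketch below only sniffs those bytes and does not parse either format. The `sniff_format` helper and the tiny demo files are hypothetical, and the demo files are not valid Parquet or Avro beyond their magic bytes.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"   # Parquet files begin and end with these 4 bytes
AVRO_MAGIC = b"Obj\x01"   # Avro object container files begin with these 4 bytes


def sniff_format(path):
    """Return 'parquet', 'avro', or 'unknown' based on the file's magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
        if head == AVRO_MAGIC:
            return "avro"
        # Smallest plausible Parquet file: magic + 4-byte footer length + magic.
        if head == PARQUET_MAGIC and os.path.getsize(path) >= 12:
            f.seek(-4, os.SEEK_END)
            if f.read(4) == PARQUET_MAGIC:
                return "parquet"
    return "unknown"


# Demo with placeholder files that carry only the magic bytes.
with tempfile.TemporaryDirectory() as tmp:
    parquet_like = os.path.join(tmp, "data.parquet")
    avro_like = os.path.join(tmp, "data.avro")
    with open(parquet_like, "wb") as f:
        f.write(PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC)
    with open(avro_like, "wb") as f:
        f.write(AVRO_MAGIC + b"placeholder")
    parquet_result = sniff_format(parquet_like)
    avro_result = sniff_format(avro_like)
```

A reader for one format rejects the other at this first check, which is why a file has to be re-serialized rather than just renamed.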
Strictly speaking, we don't convert from Parquet to CSV directly: first we read the Parquet data into a DataFrame, and then the DataFrame can be saved in any format Spark supports.
I have started an open-source project for a .NET implementation of Apache Parquet, so anyone is welcome to join: https://github.com/aloneguid/parquet-dotnet