Serialize parquet data with C#

Tags: c#, apache, parquet

Is there a way to serialize data in Apache Parquet format using C#? I can't find any implementation of it. The official Parquet docs say that "Thrift can be also code-genned into any other thrift-supported language.", but I'm not sure what this actually means.

Thanks

dhalfageme asked Nov 02 '16 12:11



1 Answer

I have started an open-source project for a .NET implementation of Apache Parquet, and anyone is welcome to join: https://github.com/aloneguid/parquet-dotnet
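For illustration, here is a minimal sketch of what serializing objects with that library might look like. It assumes a recent release of the Parquet.Net NuGet package (4.x or later) that exposes `ParquetSerializer` in the `Parquet.Serialization` namespace; older releases used a different API. The `Reading` class and the `readings.parquet` file name are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Parquet.Serialization; // NuGet package: Parquet.Net (assumed 4.x or later)

// Hypothetical record type, invented for this example.
public class Reading
{
    public DateTime Timestamp { get; set; }
    public string Sensor { get; set; } = "";
    public double Value { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        var rows = new List<Reading>
        {
            new Reading { Timestamp = DateTime.UtcNow, Sensor = "a", Value = 1.5 },
            new Reading { Timestamp = DateTime.UtcNow, Sensor = "b", Value = 2.5 },
        };

        // Write the objects to a Parquet file; the schema is inferred from the class.
        await ParquetSerializer.SerializeAsync(rows, "readings.parquet");

        // Read the file back to check the round trip.
        IList<Reading> loaded =
            await ParquetSerializer.DeserializeAsync<Reading>("readings.parquet");
        Console.WriteLine($"Read back {loaded.Count} rows");
    }
}
```

The class-based serializer is only one way to use the library; check the project's README for the API that matches the version you install.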

Ivan G. answered Oct 25 '22 00:10