
New posts in parquet

Why can't I merge multiple parquet files using "cat file1.parquet file2.parquet > result.parquet"?

Speeding up PyArrow Parquet to Pandas for dataframe with lots of strings

python pandas parquet ray

Dask DataFrame.to_parquet fails on read - repartition - write operation

I always get a Kernel Dead when using "pd.read_parquet()". (No matter which file size)

Error when reading a parquet file with polars which was saved with pandas

How to read multiple .parquet files from multiple directories into single pandas dataframe?

pandas parquet

Writing many files to parquet from Spark - Missing some parquet files

Is there a way to create a parquet file from an xml/json input file without an .avsc file and without impala/hive?

parquet

spark reading missing columns in parquet

apache-spark parquet

Writing a Vec of Rows to a Parquet file

rust parquet apache-arrow

How do I stream parquet using pyarrow?

parquet pyarrow

Is there a tool to query Parquet files which are hosted in S3 storage?

Snowflake - how to read metadata from parquet files in S3

Dynamically create Hive external table with Avro schema on Parquet Data

hive avro parquet

Spark Dataset cache is using only one executor

Streaming parquet files from S3 (Python)

I can't convert df to parquet because of a data type error

Writing Parquet in Azure Blob Storage: "One of the request inputs is not valid"

Why reading a parquet dataset requires much more memory than the size of the dataset?

problem with reading partitioned parquet files created by Snowflake with pandas or arrow