Does Google BigQuery support the Parquet file format?

I was wondering whether Google BigQuery currently supports the Parquet file format, or if there are plans to support it.

I know that it currently supports CSV and JSON formats.

YABADABADOU asked Oct 27 '15 13:10


People also ask

How do I read a parquet file in BigQuery?

At this time BigQuery does not support the Parquet file format. (Note: this snippet is outdated; as the answer below explains, Parquet loading has been supported since March 2018.)

What file format does BigQuery use?

BigQuery supports UTF-8 encoding for both nested or repeated data and flat data. It supports ISO-8859-1 encoding only for flat data, and only in CSV files.

Where is parquet file format used?

Parquet is optimized to work with complex data in bulk and offers several efficient compression and encoding schemes. This makes it especially well suited to queries that need to read only certain columns from a large table: Parquet can read just the needed columns, greatly reducing I/O.
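As a rough illustration of that column pruning, here is a minimal sketch using the pyarrow library; the file name and column names below are made-up examples, not anything from the question:

# Sketch: read only two columns from a Parquet file with pyarrow.
# "events.parquet", "user_id" and "ts" are hypothetical names.
import pyarrow.parquet as pq

table = pq.read_table("events.parquet", columns=["user_id", "ts"])
print(table.num_rows, table.column_names)

Because Parquet stores data column by column, a read like this only touches the column chunks it actually needs.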

What type of database is Google BigQuery?

BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.


1 Answer

As of 1 March 2018, support for loading Parquet 1.0 files is available.

In the BigQuery CLI, there is a --source_format PARQUET option, which is described in the output of bq --help.

I never got to use it, because when I was experimenting with this feature, it was still invite-only, and I did not request the invite.

My use case was that the Parquet file is half the size of the Avro file. I wanted to try something new and to upload data efficiently (in that order).

% bq load --source_format PARQUET test.test3 data.avro.parquet schema.json 
Upload complete.
Waiting on bqjob_r5b8a2b16d964eef7_0000015b0690a06a_1 ... (0s) Current status: DONE
[...]
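If you prefer the Python client library to the bq CLI, a load job along these lines should also work. This is only a sketch: the bucket, project, dataset, and table names are placeholders, and it assumes the google-cloud-bigquery package is installed and credentials are configured.

# Sketch: load a Parquet file from Cloud Storage with the BigQuery Python client.
# The URI and table ID below are placeholders.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET)
load_job = client.load_table_from_uri(
    "gs://my-bucket/data.avro.parquet",
    "my-project.test.test3",
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete

Note that with Parquet the schema is read from the file itself, so no separate schema file is needed.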
user7610 answered Sep 30 '22 17:09