
Cloudera 5.6: Parquet does not support date. See HIVE-6384

I am using Cloudera 5.6 and trying to create a Parquet-format table in Hive based on another table, but I am running into an error.

create table sfdc_opportunities_sandbox_parquet like sfdc_opportunities_sandbox STORED AS PARQUET;

Error Message

Parquet does not support date. See HIVE-6384

I read that Hive 1.2 has a fix for this issue, but Cloudera 5.6 and 5.7 do not come with Hive 1.2. Has anyone found a way around this issue?

asked May 20 '16 at 22:05 by pitchblack408




1 Answer

Apart from using another data type such as TIMESTAMP, or another storage format such as ORC, there may be no way around this if you depend on both the bundled Hive version and the Parquet storage format.
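As a rough sketch of those two workarounds: the source table's schema is not shown in the question, so the columns `id` and `close_date` and the table name `sfdc_opportunities_sandbox_orc` are hypothetical placeholders. Adapt them to the real schema.

```sql
-- Workaround 1: keep Parquet, but declare the DATE column as TIMESTAMP.
-- The columns id and close_date are assumed for illustration only.
CREATE TABLE sfdc_opportunities_sandbox_parquet (
  id         STRING,
  close_date TIMESTAMP
)
STORED AS PARQUET;

-- Copy the data, casting the DATE column to TIMESTAMP on the way in.
INSERT OVERWRITE TABLE sfdc_opportunities_sandbox_parquet
SELECT id, CAST(close_date AS TIMESTAMP)
FROM sfdc_opportunities_sandbox;

-- Workaround 2: keep the DATE type, but store the copy as ORC instead.
-- The target table name is a hypothetical placeholder.
CREATE TABLE sfdc_opportunities_sandbox_orc
STORED AS ORC
AS SELECT * FROM sfdc_opportunities_sandbox;
```

With the first approach, queries that need a date can cast back with `CAST(close_date AS DATE)` or `to_date(close_date)`. The second approach leaves the column types untouched but gives up Parquet, which may matter if other tools read the files directly.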

According to Cloudera's CDH 5 Packaging and Tarball Information, the whole CDH 5 branch ships with Apache Parquet 1.5.0 and Apache Hive 1.1.0.

DATE support was implemented in ParquetSerde by HIVE-8119 and is available as of Hive 1.2.

answered Sep 20 '22 at 13:09 by U880D