I am currently using Cloudera 5.6 trying to create a parquet format table in hive table based off another table, but I am running into an error.
create table sfdc_opportunities_sandbox_parquet like
sfdc_opportunities_sandbox STORED AS PARQUET
Error Message
Parquet does not support date. See HIVE-6384
I read that hive 1.2 has a fix for this issue, but Cloudera 5.6 and 5.7 do not come with hive 1.2. Has anyone found way around this issue?
The DATE type is supported for Avro, HBase, Kudu, Parquet, and Text.
Parquet is supported by a plugin in Hive 0.10, 0.11, and 0.12 and natively in Hive 0.13 and later.
Parquet requires a Hive metastore version of 1.2 or above in order to use TIMESTAMP .
Parquet stores date as INT32 that stores the number of days from the Unix epoch, January 1, 1970.
Except from using an other data type like TIMESTAMP or an other storage format like ORC, there might be no way around if there is a dependency to the used Hive version and Parquet file storage format.
According Clouderas CDH 5 Packaging and Tarball Information, the whole branch 5 comes packed with Apache Parquet in v1.5.0 and Apache Hive in v1.1.0.
Date was implemented in ParquetSerde with HIVE-8119 and as of Hive 1.2.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With