Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hive - Partition Column Equal to Current Date

Tags:

hive

hiveql

I am trying to insert into a Hive table from another table that does not have a column for todays date. The partition I am trying to create is at the date level. What I am trying to do is something like this:

INSERT OVERWRITE TABLE table_2_partition 
PARTITION (p_date = from_unixtime(unix_timestamp() - (86400*2) ,  'yyyy-MM-dd'))
SELECT * FROM table_1;

But when I run this I get the following error:

"cannot recognize input near 'from_unixtime' '(' 'unix_timestamp' in constant"

If I query a table and make one of the columns that it work just fine. Any idea how to set the partition date to current system date in HiveQL?

Thanks in advance,

Craig

like image 819
craigtb Avatar asked Sep 29 '22 15:09

craigtb


1 Answers

What you want here is Hive dynamic partitioning. This allows the decision for which partition each record is inserted into be determined dynamically as the record is selected. In your case, that decision is based on the date when you run the query.

To use dynamic partitioning your partition clause has the partition field(s) but not the value. The value(s) that maps to the partition field(s) is the value(s) at the end of the SELECT, and in the same order.

When you use dynamic partitions for all partition fields you need to ensure that you are using nonstrict for your dynamic partition mode (hive.exec.dynamic.partition.mode).

In your case, your query would look something like:

SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE table_2_partition 
PARTITION (p_date)
SELECT
    *
  , from_unixtime(unix_timestamp() - (86400*2) ,  'yyyy-MM-dd')
FROM table_1;
like image 67
Jeremy Beard Avatar answered Oct 05 '22 07:10

Jeremy Beard