Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate Date Series in HIVE? (Creating table)

Tags:

hadoop

hive

Suppose I currently have a table that has 1 row for each account and the data in the tables are:

  1. Account Number
  2. Start Date
  3. End Date

Now I'd like to create a new table that has 1 row for each day the account is open, i.e. 1 day for each row between the start and end dates (inclusive) for each account.

E.g.

Table 1

Account Number    Start Date    End Date
123               1-Jan-17      1-Jul-17
456               1-Feb-17      4-May-17

Table 2 (Desired table)

Account Number    Day
123               1-Jan-17
123               1-Jan-17
     ...
123               1-Jul-17
456               1-Feb-17
456               2-Feb-17
     ...
456               4-May-17

I know in Postgresql there's a function called 'generate series' that would allow you to do that easily. I'm wondering if there's a similar function in HIVE that would allow you to do that as well?

Thanks!

like image 447
Sheng Chai Avatar asked Jul 24 '17 10:07

Sheng Chai


People also ask

Which function will return the number of days between two dates in Hive?

The DATEDIFF function returns the number of days between the two given dates.

What is CTAS in Hive?

Create Table as select (CTAS) is possible in Hive. You can try out below command: CREATE TABLE new_test row format delimited fields terminated by '|' STORED AS RCFile AS select * from source where col=1. Target cannot be partitioned table. Target cannot be external table. It copies the structure as well as the data.

Can we create table in Hive?

In Apache Hive we can create tables to store structured data so that later on we can process it. The table in the hive is consists of multiple columns and records. The table we create in any database will be stored in the sub-directory of that database.

How do I change the date column in Hive?

Correct command is: alter table table_name change col3 col3 date; The column change command will only modify Hive's metadata, and will not modify data.


1 Answers

select  t.AccountNumber
       ,date_add (t.StartDate,pe.i)   as Day

from    Table1 t
        lateral view 
        posexplode(split(space(datediff(t.EndDate,t.StartDate)),' ')) pe as i,x
like image 93
David דודו Markovitz Avatar answered Sep 24 '22 05:09

David דודו Markovitz