Suppose I currently have a table that has 1 row for each account and the data in the tables are:
Now I'd like to create a new table that has 1 row for each day the account is open, i.e. 1 day for each row between the start and end dates (inclusive) for each account.
E.g.
Table 1
Account Number Start Date End Date
123 1-Jan-17 1-Jul-17
456 1-Feb-17 4-May-17
Table 2 (Desired table)
Account Number Day
123 1-Jan-17
123 1-Jan-17
...
123 1-Jul-17
456 1-Feb-17
456 2-Feb-17
...
456 4-May-17
I know in Postgresql there's a function called 'generate series' that would allow you to do that easily. I'm wondering if there's a similar function in HIVE that would allow you to do that as well?
Thanks!
The DATEDIFF function returns the number of days between the two given dates.
Create Table as select (CTAS) is possible in Hive. You can try out below command: CREATE TABLE new_test row format delimited fields terminated by '|' STORED AS RCFile AS select * from source where col=1. Target cannot be partitioned table. Target cannot be external table. It copies the structure as well as the data.
In Apache Hive we can create tables to store structured data so that later on we can process it. The table in the hive is consists of multiple columns and records. The table we create in any database will be stored in the sub-directory of that database.
Correct command is: alter table table_name change col3 col3 date; The column change command will only modify Hive's metadata, and will not modify data.
select t.AccountNumber
,date_add (t.StartDate,pe.i) as Day
from Table1 t
lateral view
posexplode(split(space(datediff(t.EndDate,t.StartDate)),' ')) pe as i,x
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With