Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explode the Array of Struct in Hive

This is the below Hive Table

CREATE EXTERNAL TABLE IF NOT EXISTS SampleTable ( USER_ID BIGINT, NEW_ITEM ARRAY<STRUCT<PRODUCT_ID: BIGINT,TIMESTAMPS:STRING>> ) 

And this is the data in the above table-

1015826235     [{"product_id":220003038067,"timestamps":"1340321132000"},{"product_id":300003861266,"timestamps":"1340271857000"}] 

Is there any way I can get the below output from the HiveQL after exploding the array?

**USER_ID**  |  **PRODUCT_ID**  |   **TIMESTAMPS**  ------------+------------------+---------------- 1015826235      220003038067       1340321132000 1015826235      300003861266       1340271857000 

Updated

I wrote this query to get the output in the above format, but it is not giving me the result in the way I wanted to.

SELECT myTable1.myCol1,myTable2.myCol2 FROM sampletable st LATERAL VIEW  explode(st.purchased_item.product_id) myTable1 AS myCol1 LATERAL VIEW  explode(st.purchased_item.timestamps) myTable2 AS myCol2; 

Can anyone help me what wrong I am doing? Any suggestions will be appreciated.

like image 449
arsenal Avatar asked Jul 07 '12 08:07

arsenal


People also ask

How do you access an array of structs in Hive?

Structs: Data within structs can be accessed through DOT (.). For example, the type of column c in the table is STRUCT{a INT; b INT}, and we can access domain a through c.a. Maps (K-V pairs): Access to a specified domain can be done by ["designated domain name"].

How do you access struct elements in Hive?

We can access the elements in STRUCT type using DOT (.) notation. Example: For a column c3 of type STRUCT {c1 INTEGER; c2 INTEGER}, the c1 field is accessed by the expression c3.


2 Answers

You need to explode only once (in conjunction with LATERAL VIEW). After exploding you can use a new column (called prod_and_ts in my example) which will be of struct type. Then, you can resolve the product_id and timestamps members of this new struct column to retrieve the desired result.

SELECT    user_id,    prod_and_ts.product_id as product_id,    prod_and_ts.timestamps as timestamps FROM     SampleTable     LATERAL VIEW explode(new_item) exploded_table as prod_and_ts; 
like image 166
Mark Grover Avatar answered Oct 12 '22 13:10

Mark Grover


If you are on Hive 0.10 or later, you could also use inline(ARRAY<STRUCT[,STRUCT]>). It explodes an array of structs into a table.

like image 38
Tariq Avatar answered Oct 12 '22 11:10

Tariq