Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hive Lateral View Explode

Tags:

hive

I have data that has a nested mapping in one of the fields. For example, the data is in a table called 'customers' and looks like this:

Name: Bill Jones

Address: {"billing":{"street":"123 Main", "city":"Chicago", "state":"IL"},"shipping":{"street":"432 Copper", "city":"New York", "state":"NY"}}

What sort of query can I write to explode the entire dataset out?

I started with the following:

select name, key1, value1
from customers
lateral view explode(address) table1 as key1, value1

That got me part of the way there. It splits up the "billing" and "shipping" fields. However, I can't explode out the remaining. I tried this, but got an error message:

select name, key1, key2, value2
from customers
lateral view explode(address) table1 as key1, value1
lateral view explode(value1) table2 as key2, value2

I know I'm not doing it right, but not sure what the fix is?

Thanks,

Chetan

like image 284
Chetan Prabhu Avatar asked Mar 10 '23 19:03

Chetan Prabhu


2 Answers

Demo

create table customers (Name string, addresses map<string,struct<street1:string,street2:string,city:string,state:string>>);

insert into customers 
select  'Bill Jones'
       ,map
        (
            'billing'   ,named_struct('street1','123 Main'   ,'street2','' ,'city','Chicago'  ,'state','IL')
           ,'shipping'  ,named_struct('street1','432 Copper' ,'street2','' ,'city','New York' ,'state','NY')
        )
;

Option 1

select  name

       ,addresses['billing'].street1   as billing_street1 
       ,addresses['billing'].street2   as billing_street2
       ,addresses['billing'].city      as billing_city   
       ,addresses['billing'].state     as billing_state  
                                       
       ,addresses['shipping'].street1  as shipping_street1
       ,addresses['shipping'].street2  as shipping_street2
       ,addresses['shipping'].city     as shipping_city   
       ,addresses['shipping'].state    as shipping_state  

from    customers
;

+------------+-----------------+-----------------+--------------+---------------+------------------+------------------+---------------+----------------+
|    name    | billing_street1 | billing_street2 | billing_city | billing_state | shipping_street1 | shipping_street2 | shipping_city | shipping_state |
+------------+-----------------+-----------------+--------------+---------------+------------------+------------------+---------------+----------------+
| Bill Jones | 123 Main        |                 | Chicago      | IL            | 432 Copper       |                  | New York      | NY             |
+------------+-----------------+-----------------+--------------+---------------+------------------+------------------+---------------+----------------+        
        

Option 2

select  name 
       ,key             as address_type
       ,value.street1
       ,value.street2
       ,value.city
       ,value.state
        
from    customers
        lateral view explode(addresses) a  
;

+------------+--------------+------------+---------+----------+-------+
|    name    | address_type |  street1   | street2 |   city   | state |
+------------+--------------+------------+---------+----------+-------+
| Bill Jones | billing      | 123 Main   |         | Chicago  | IL    |
| Bill Jones | shipping     | 432 Copper |         | New York | NY    |
+------------+--------------+------------+---------+----------+-------+
    
like image 60
David דודו Markovitz Avatar answered Mar 23 '23 06:03

David דודו Markovitz


Based on your comments, you have the following structure

name (string), addresses (map>), email (string), phone (string), spend (int)

What you have here, is a map of string, structure not a map of maps, so you can query the data in the folllowing way

select name, key1, 
value1.street1,
value1.street2,
value1.city,
value1.state
from customers
lateral view explode(address) table1 as key1, value1
like image 40
hlagos Avatar answered Mar 23 '23 06:03

hlagos