Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to create a table in Hive with a column of data type array<map<string, string>>

I am trying to create a table which has a complex data type. And the data types are listed below.

  1. array

  2. map

  3. array< map < String,String> >

I am trying to create a data structure of 3 type . Is it ever possible to create in Hive? My table DDL looks like below.

create table complexTest(names array<String>,infoMap map<String,String>, deatils array<map<String,String>>)           
row format delimited                                                                                       
fields terminated by '/'                                                                                   
collection items terminated by '|'                                                                         
map keys terminated by '='                                                                                 
lines terminated by '\n';

And my sample data looks like below.

Abhieet|Test|Complex/Name=abhi|age=31|Sex=male/Name=Test,age=30,Sex=male|Name=Complex,age=30,Sex=female

Whever i am querying the data from the table i am getting the below values

["Abhieet"," Test"," Complex"]  {"Name":"abhi","age":"31","Sex":"male"} [{"Name":null,"Test,age":null,"31,Sex":null,"male":null},{"Name":null,"Complex,age":null,"30,Sex":null,"female":null}]

Which is not i am expecting. Could you please help me to find out what should be the DDL if it ever possible for data type array< map < String,String>>

like image 604
Abhijeet Chaudhury Avatar asked Jun 10 '15 00:06

Abhijeet Chaudhury


2 Answers

I don't think this is possible using the inbuilt serde. If you know in advance what the values in your maps are going to be, then I think a better way of approaching this would be to convert your input data to JSON, and then use the Hive json serde:

Sample data:

{'Name': ['Abhieet', 'Test', 'Complex'],
'infoMap': {'Sex': 'male', 'Name': 'abhi', 'age': '31'},
 'details': [{'Sex': 'male', 'Name': 'Test', 'age': '30'}, {'Sex': 'female', 'Name': 'Complex', 'age': '30'}]
 }

Table definition code:

create table complexTest
(
names array<string>,
infomap struct<Name:string,
               age:string,
               Sex:string>,
details array<struct<Name:string,
               age:string,
               Sex:string>>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
like image 190
maxymoo Avatar answered Nov 15 '22 07:11

maxymoo


This can be handled with array of structs using the following query.

create table complexStructArray(custID String,nameValuePairs array<struct< key:String, value:String>>) row format delimited fields terminated by '/' collection items terminated by '|' map keys terminated by '=' lines terminated by '\n';

Sample data:

101/Name=Madhavan|age=30

102/Name=Ramkumar|age=31

Though struct allows duplicate key values unlike Map, above query should handle the ask if the data is having unique key values.

select query would give the output as follows.

hive> select * from complexStructArray;

101 [{"key":"Name","value":"Madhavan"},{"key":"age","value":"30"}]

102 [{"key":"Name","value":"Ramkumar"},{"key":"age","value":"31"}]

like image 31
Madhavan Avatar answered Nov 15 '22 08:11

Madhavan