
Hive - grouping rows into map

I have a table like this:

Col1   Col2   Col3
A      1      word1
A      2      word2
A      3      word3
A      4      word4
B      1      word1
B      3      word3

I want to group by col1 and collect the col2 and col3 values of each group into a map, with col2 as the key and col3 as the value, like so:

Col1   map(col2, col3)
A      [(1, word1), (2, word2), (3, word3), (4, word4)]
B      [(1, word1), (3, word3)]

I know this can be done with an array, as shown in "Grouping hive rows in an array of this rows".

Is it also possible with a map (key/value pairs)?

maia asked Oct 21 '22 19:10



1 Answer

Use the "collect" UDF from Brickhouse (http://github.com/klout/brickhouse). Called with two arguments, a key and a value, it aggregates each group into a map:

select col1, collect( col2, col3 )
from mytable
group by col1
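Brickhouse functions are not built into Hive, so the query above only works once the jar is on the classpath and the function is registered. A minimal sketch of that setup follows. The jar path is hypothetical, and the class name is my assumption of how Brickhouse registers "collect", so check it against the brickhouse.hql script shipped with the project:

```sql
-- Hypothetical jar location; point this at your own Brickhouse build.
ADD JAR /path/to/brickhouse.jar;

-- Assumed class name; verify against brickhouse.hql in the Brickhouse repo.
CREATE TEMPORARY FUNCTION collect AS 'brickhouse.udf.collect.CollectUDAF';

-- col2 becomes the map key and col3 the map value, one map per col1 group.
SELECT col1, collect(col2, col3) AS col2_to_col3
FROM mytable
GROUP BY col1;
```

The alias col2_to_col3 is only illustrative. The result column is a Hive map keyed by col2, so a single entry can be looked up with the usual bracket syntax, for example col2_to_col3[1].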

If the rows already contain maps, you can merge them within each group using the "union_map" UDAF.
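As a sketch of the "union_map" case, assume a hypothetical table mytable_maps that already holds one map per row in a column named m. The jar path and table are illustrative, and the class name is my assumption, so confirm it in brickhouse.hql before relying on it:

```sql
-- Hypothetical jar location; point this at your own Brickhouse build.
ADD JAR /path/to/brickhouse.jar;

-- Assumed class name; verify against brickhouse.hql in the Brickhouse repo.
CREATE TEMPORARY FUNCTION union_map AS 'brickhouse.udf.collect.UnionUDAF';

-- mytable_maps(col1, m) is a hypothetical table where m is a map column.
-- All maps belonging to the same col1 are merged into a single map.
SELECT col1, union_map(m) AS merged
FROM mytable_maps
GROUP BY col1;
```

This is useful when the maps were built in an earlier step, for example per partition or per day, and need to be combined per key afterwards.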

Jerome Banks answered Oct 24 '22 01:10
