I have a table like so:
Col1 Col2 Col3
A 1 word1
A 2 word2
A 3 word3
A 4 word4
B 1 word1
B 3 word3
I want to group col2 and col3 by col1, but keep col2 and col3 together in a map, like so:
Col1 map(col2, col3)
A [(1, word1), (2, word2), (3, word3), (4, word4)]
B [(1, word1), (3, word3)]
I know there is a way to do this with just an array, as appears here: Grouping hive rows in an array of this rows
But I'm wondering if this is possible with a map (key/value pairs).
Use the "collect" UDAF from Brickhouse (http://github.com/klout/brickhouse). Called with two arguments, it collects the key/value pairs of each group into a map:
select col1, collect( col2, col3 )
from mytable
group by col1
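For the query above to run, the Brickhouse jar has to be added to the Hive session and the function registered first. A minimal sketch, assuming the jar has been built locally (the path is a placeholder) and that the class name matches the one used in Brickhouse's own setup script; check it against your version:

```sql
-- Placeholder path: point this at wherever your Brickhouse jar lives.
ADD JAR /path/to/brickhouse.jar;

-- Class name is an assumption taken from the Brickhouse setup script.
CREATE TEMPORARY FUNCTION collect AS 'brickhouse.udf.collect.CollectUDAF';
```

The function is only registered for the current session, so these statements need to run before each query session, or go into an init script.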
You can also merge maps with the "union_map" UDAF.
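As a rough sketch of the merge case: the table name mytable_maps and its map column m below are hypothetical, and the class name is assumed from the Brickhouse setup script, so verify it against your version.

```sql
-- Class name is an assumption; check the Brickhouse setup script.
CREATE TEMPORARY FUNCTION union_map AS 'brickhouse.udf.collect.UnionUDAF';

-- Hypothetical table: mytable_maps(col1 string, m map<int,string>),
-- with several partial maps per col1 value.
SELECT col1, union_map(m) AS merged_map
FROM mytable_maps
GROUP BY col1;
```

This fits the situation where the maps have already been built per partition or per batch and need to be combined into one map per key.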