Say my table looks like this:
Name,Subject,Score
Jon,English,80
Amy,Geography,70
Matt,English,90
Jon,Math,100
Jon,History,60
Amy,French,90
Is there a way to use collect_list so that I can get my query as such:
Jon: English:80; Math:100; History:60
Amy: Geography:70; French:90
Matt: English:90
EDIT:
The complication here is that the collect_list UDF permits only one argument, i.e. one column.
Something like
SELECT name, collect_list(subject), collect_list(score) from mytable group by name
results in
Jon | [English,Math,History] | [80,100,60]
Amy | [Geography,French] | [70,90]
Matt | [English] | [90]
Not sure if this is what you needed.
select * from t0;
+-------+------------+-------+--+
| t0.a | t0.b | t0.c |
+-------+------------+-------+--+
| Jon | English | 80 |
| Amy | Geography | 70 |
| Matt | English | 90 |
| Jon | Math | 100 |
| Jon | History | 60 |
| Amy | French | 90 |
+-------+------------+-------+--+
select a, collect_list(concat_ws(':',b,cast(c as string))) from t0 group by a;
+-------+-----------------------------------------+--+
| a | _c1 |
+-------+-----------------------------------------+--+
| Amy | ["Geography:70","French:90"] |
| Jon | ["English:80","Math:100","History:60"] |
| Matt | ["English:90"] |
+-------+-----------------------------------------+--+
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With