 

Spark SQL grouping: "Add to group by or wrap in first() if you don't care which value you get"

I have a query in Spark SQL like:

select count(ts), truncToHour(ts)
from myTable
group by truncToHour(ts)

Here ts is of timestamp type, and truncToHour is a UDF that truncates the timestamp to the hour. This query does not work. If I try

select count(ts), ts from myTable group by truncToHour(ts)

I get the error "expression 'ts' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you get.", but first() is not defined if I do:

select count(ts), first(ts) from myTable group by truncToHour(ts)

Is there any way to get what I want without using a subquery? Also, why does the error say "wrap in first()" when first() is not defined?

asked Jul 09 '15 by Paul Z Wu

1 Answer

This is tracked in https://issues.apache.org/jira/browse/SPARK-9210

It seems the actual function is first_value, not first.
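As a minimal sketch (assuming truncToHour is registered as a UDF and myTable is available as a table or temporary view, as in the question), the query becomes:

select count(ts), first_value(ts)
from myTable
group by truncToHour(ts)

first_value returns one arbitrary timestamp from each hour group, which matches the "if you don't care which value you get" wording in the error message. If you want the hour itself rather than an arbitrary timestamp, selecting the grouping expression truncToHour(ts) directly should also work where the grouping-expression bug referenced above has been fixed.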

answered Oct 03 '22 by Kumar Deepak