 

Spark SQL grouping: "Add to group by or wrap in first() if you don't care which value you get"

I have a query in Spark SQL like:

select count(ts), truncToHour(ts)
from myTable
group by truncToHour(ts)

Here ts is of timestamp type, and truncToHour is a UDF that truncates the timestamp to the hour. This query does not work. If I try

select count(ts), ts from myTable group by truncToHour(ts)

I get the error "expression 'ts' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you get.", but first() is not defined if I do:

select count(ts), first(ts) from myTable group by truncToHour(ts)

Is there any way to get what I want without using a subquery? Also, why does the error say "wrap in first()" when first() is not defined?

asked Jul 09 '15 by Paul Z Wu

1 Answer

This is tracked in https://issues.apache.org/jira/browse/SPARK-9210

It seems the actual function is first_value, not first.
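As a minimal sketch (assuming truncToHour is registered as a UDF and myTable is available as a table or temporary view, as in the question), the query becomes:

select count(ts), first_value(ts)
from myTable
group by truncToHour(ts)

first_value returns one arbitrary timestamp from each hour group, which matches the "if you don't care which value you get" wording in the error message. If you want the hour itself rather than an arbitrary timestamp, selecting the grouping expression truncToHour(ts) directly should also work where the grouping-expression bug referenced above has been fixed.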

answered Oct 03 '22 by Kumar Deepak