
Executing SQL Statements in spark-sql

I have a text file in the following format:

ID,Name,Rating
1,A,3
2,B,4
1,A,4

I want to find the average rating for each ID in Spark. This is the code I have so far, but it keeps giving me an error:

val Avg_data=spark.sql("select ID, AVG(Rating) from table")

ERROR: org.apache.spark.sql.AnalysisException: grouping expressions sequence is empty, and 'table'.'ID' is not an aggregate function. Wrap '(avg(CAST(table.'Rating' AS BIGINT)) as 'avg(Rating)')' in windowing function(s).........

Skyhopper9 asked May 30 '26 18:05


1 Answer

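AVG() is an aggregate function, so every non-aggregated column in the select list (here ID) must also appear in a GROUP BY clause. Without it, Spark has no grouping to compute the average over and raises the AnalysisException above: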

val Avg_data=spark.sql("select ID, AVG(Rating) as average from table group by ID")

Avg_data should then contain:

+---+-------+
|ID |average|
+---+-------+
|1  |3.5    |
|2  |4.0    |
+---+-------+
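
For anyone who wants to try this end to end, here is a minimal sketch, not a definitive setup. It assumes a local SparkSession and builds the three sample rows in memory instead of reading the text file. The view name ratings and the object name AverageRating are hypothetical and chosen only for illustration. It runs the same GROUP BY query and also what should be the equivalent DataFrame API call:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

// Hypothetical object name, used only for this illustration.
object AverageRating {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation; inside spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("AverageRating")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // The sample rows from the question, standing in for the text file.
    // A real file with a header row could instead be loaded with something like
    // spark.read.option("header", "true").option("inferSchema", "true").csv(path).
    val ratings = Seq((1, "A", 3), (2, "B", 4), (1, "A", 4))
      .toDF("ID", "Name", "Rating")

    // Assumption: the view is called "ratings" here rather than "table".
    ratings.createOrReplaceTempView("ratings")

    // SQL form: ID is not aggregated, so it is listed in GROUP BY.
    val viaSql = spark.sql(
      "SELECT ID, AVG(Rating) AS average FROM ratings GROUP BY ID")

    // DataFrame API form of the same aggregation.
    val viaApi = ratings.groupBy("ID").agg(avg("Rating").as("average"))

    // Order by ID so both results print in the same row order.
    viaSql.orderBy("ID").show()
    viaApi.orderBy("ID").show()

    spark.stop()
  }
}
```

The DataFrame form avoids the original error by construction: groupBy names the grouping columns up front, and agg only accepts aggregate expressions.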