
Aggregate function in spark-sql not found

I am new to Spark and I am trying to make use of some aggregate features, like sum or avg. My query in spark-shell works perfectly:

val somestats = pf.groupBy("name").agg(sum("days")).show()

When I try to run it from a Scala project, however, it does not work and throws an error message:

not found: value sum

I have tried to add

import sqlContext.implicits._
import org.apache.spark.SparkContext._

just before the command, but it does not help. My spark version is 1.4.1 Am I missing anything?

asked Jul 24 '15 by TheMP


1 Answer

You need this import:

import org.apache.spark.sql.functions._
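Aggregate functions such as `sum` and `avg` live in the `org.apache.spark.sql.functions` object, which the spark-shell does not import for you in a standalone project. A minimal self-contained sketch against the Spark 1.4-era `SQLContext` API (the `pf` DataFrame and its `name`/`days` columns follow the question; the sample rows are made up for illustration):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._  // brings sum, avg, count, etc. into scope

object AggExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AggExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._  // enables .toDF on RDDs of tuples

    // Hypothetical sample data with the question's column names
    val pf = sc.parallelize(Seq(("alice", 3), ("bob", 5), ("alice", 2)))
      .toDF("name", "days")

    // sum("days") now resolves because of the functions._ import
    pf.groupBy("name").agg(sum("days"), avg("days")).show()

    sc.stop()
  }
}
```

Note that `import sqlContext.implicits._` alone only provides the DataFrame conversions; it does not bring the aggregate functions into scope, which is why the original imports did not help.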
answered Sep 24 '22 by Justin Pihony