I am new to Spark and I am trying to use aggregate functions such as sum and avg. My query in spark-shell works perfectly:
val somestats = pf.groupBy("name").agg(sum("days")).show()
When I try to run it from a Scala project, though, it does not work and throws the error message:
not found: value sum
I have tried to add
import sqlContext.implicits._
import org.apache.spark.SparkContext._
just before the command, but it does not help. My Spark version is 1.4.1. Am I missing anything?
From the GroupedData API docs: agg(Column expr, scala.collection.Seq<Column> exprs) computes aggregates by specifying a series of aggregate columns.
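In Scala this surfaces as agg(expr: Column, exprs: Column*), so you can pass several aggregate columns in one call. A minimal sketch (the column aliases are illustrative, and it assumes the functions import shown in the answer below):

import org.apache.spark.sql.functions.{sum, avg, count}

// pf.groupBy("name") returns a GroupedData; agg takes one or more aggregate columns
pf.groupBy("name").agg(
  sum("days").alias("total_days"),   // first aggregate column (the expr parameter)
  avg("days").alias("avg_days"),     // further columns go through the varargs parameter
  count("days").alias("row_count")
).show()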
Spark SQL is a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. It enables unmodified Hadoop Hive queries to run up to 100x faster on existing deployments and data.
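Since Spark SQL can act as a SQL engine, the same aggregation can also be written as a plain query. A minimal sketch, assuming the pf DataFrame and sqlContext from the question (registerTempTable is the Spark 1.4-era API):

// Register the DataFrame so it can be queried by name
pf.registerTempTable("pf")
sqlContext.sql("SELECT name, SUM(days) AS total_days FROM pf GROUP BY name").show()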
You need this import:
import org.apache.spark.sql.functions._
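This is what brings sum, avg, count, and the other aggregate functions into scope; sqlContext.implicits._ only provides conversions such as .toDF. Here is a minimal, self-contained sketch showing the import in context (the sample data is made up to stand in for your pf DataFrame):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._  // brings sum, avg, count, ... into scope

object AggExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("agg-example").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._  // enables .toDF on local collections

    // Hypothetical sample data standing in for your pf DataFrame
    val pf = sc.parallelize(Seq(("alice", 3), ("bob", 5), ("alice", 2))).toDF("name", "days")

    // sum now resolves thanks to org.apache.spark.sql.functions._
    pf.groupBy("name").agg(sum("days")).show()

    sc.stop()
  }
}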