I have a data frame and want to find the total salary for each group. In Oracle I'd use this query:
select job_id,sum(salary) as "Total" from hr.employees group by job_id;
I tried the same in Spark SQL, but I am facing two issues. First, with this version:
empData.groupBy($"job_id").sum("salary").alias("Total").show()
Second, I could not use $ (Scala column syntax, I think) inside sum. This gives a compilation error:
empData.groupBy($"job_id").sum($"salary").alias("Total").show()
Any ideas?
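Use the aggregate function .agg() if you want to provide an alias for the column. It accepts the Scala column syntax ($" "), whereas .sum() on grouped data only takes column names as strings, which is why sum($"salary") does not compile there. Also note that .alias() called after .sum() names the whole DataFrame, not the summed column.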
empData.groupBy($"job_id").agg(sum($"salary") as "Total").show()
If you don't want to use .agg(), the alias can also be provided using .select():
empData.groupBy($"job_id").sum("salary").select($"job_id", $"sum(salary)".alias("Total")).show()
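For completeness, here is a minimal self-contained sketch of the .agg() approach. Two imports matter: spark.implicits._ is what enables the $"..." syntax, and org.apache.spark.sql.functions.sum is the column-based sum used inside .agg(). The object name TotalSalaryByJob, the helper totalByJob, and the sample rows are made up for illustration and only stand in for the real hr.employees data; it assumes a local Spark session is available.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

// Hypothetical wrapper object, not part of the original question.
object TotalSalaryByJob {

  // Groups by job_id and names the summed column "Total".
  def totalByJob(spark: SparkSession, empData: DataFrame): DataFrame = {
    import spark.implicits._ // enables the $"colName" syntax
    empData.groupBy($"job_id").agg(sum($"salary").as("Total"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: running locally
      .appName("TotalSalaryByJob")
      .getOrCreate()
    import spark.implicits._ // needed here for toDF on a Seq

    // Made-up sample rows standing in for hr.employees.
    val empData = Seq(
      ("IT_PROG", 9000L),
      ("IT_PROG", 6000L),
      ("SA_REP", 7000L)
    ).toDF("job_id", "salary")

    // Resulting columns are job_id and Total.
    totalByJob(spark, empData).show()
    spark.stop()
  }
}
```

If you prefer to keep .sum("salary"), renaming afterwards with .withColumnRenamed("sum(salary)", "Total") should give the same column name as the .select() variant.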