Dataframe: how to groupBy/count then order by count in Scala

Tags: scala, apache-spark
I have a DataFrame that contains thousands of rows. I want to group by a column, count the rows in each group, and then order the result by that count. What I did looks something like this:

import org.apache.spark.sql.hive.HiveContext
import sqlContext.implicits._


val objHive = new HiveContext(sc)
val df = objHive.sql("select * from db.tb")
val df_count=df.groupBy("id").count().collect()
df_count.sort($"count".asc).show()

asked Aug 07 '18 11:08

HISI

2 Answers

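You can use either sort or orderBy, as shown below:

import org.apache.spark.sql.functions.desc

val df_count = df.groupBy("id").count()

// orderBy is an alias for sort; both lines print the groups by descending count
df_count.sort(desc("count")).show(false)

df_count.orderBy($"count".desc).show(false)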

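Don't call collect() before sorting: it brings all the data to the driver and returns a local Array[Row], which is no longer a DataFrame, so DataFrame methods such as show() are not available on it.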
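
To make the difference concrete, here is a minimal sketch that can be pasted into spark-shell or run as a Scala script. The local[*] master, the app name, and the sample ids are assumptions made up for illustration; they stand in for the Hive table in the question. The sort happens while the data is still a DataFrame, and collect() is called only at the very end:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.desc

// Assumption: a local session for experimenting; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("groupByCountSketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for `select * from db.tb`.
val df = Seq("a", "b", "a", "c", "a", "b").toDF("id")

// groupBy(...).count() returns a DataFrame with the columns `id` and `count`.
val dfCount = df.groupBy("id").count()

// Sort while it is still a DataFrame, then display it.
val sorted = dfCount.orderBy(desc("count"))
sorted.show(false)

// collect() is only needed when the rows must live on the driver;
// the result is a plain Array[Row], not a DataFrame.
val rows: Array[Row] = sorted.collect()
val pairs = rows.map(r => (r.getString(0), r.getLong(1)))
```

If the collected rows are needed afterwards, they have to be handled with ordinary Scala collection methods, as in the last line, rather than with DataFrame operations.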

Hope this helps!

answered Oct 27 '22 20:10

koiralo

// SparkSession is the entry point to the DataFrame API
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val pathOfFile = "f:/alarms_files/"
// Create the session and keep it in the spark variable
val spark = SparkSession.builder().appName("myApp").getOrCreate()
// Read the tab-delimited file with a header row; the reader returns a DataFrame
var df = spark.read.format("csv").option("header", "true").option("delimiter", "\t").load("file://" + pathOfFile + "db.tab")
// Group by the id column, count the rows in each group, and order by that count (ascending by default)
df = df.groupBy(df("id")).agg(count("*").as("columnCount")).orderBy("columnCount")
// show() prints only the first 20 rows by default
df.show
// To print more rows, pass the number of rows, e.g.:
df.show(50)
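
For anyone who wants to try the same agg pattern without the file, the following is a rough sketch on in-memory data. The local[*] master, the app name, the `alarms` name, and the sample rows are hypothetical; the descending order and the secondary sort on id are additions, included so that the largest groups come first and ties come out in a stable order:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count}

// Assumption: a local session; in spark-shell, `spark` is already defined.
val spark = SparkSession.builder().master("local[*]").appName("aggCountSketch").getOrCreate()
import spark.implicits._

// Hypothetical rows standing in for the tab-separated file.
val alarms = Seq(("x", 1), ("y", 2), ("x", 3), ("z", 4), ("y", 5), ("x", 6)).toDF("id", "value")

// Name the aggregate column explicitly, then sort by it in descending order,
// using id as a tie-breaker.
val counted = alarms
  .groupBy(col("id"))
  .agg(count("*").as("columnCount"))
  .orderBy(col("columnCount").desc, col("id").asc)

// Print up to 50 rows without truncating long cell values.
counted.show(50, false)

// The first row of the sorted result is the most frequent id.
val top = counted.head()
```

Naming the column through agg(...).as(...) avoids relying on the default `count` column name, which can be handy when several aggregates are computed together.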

answered Oct 27 '22 19:10

Gagan Sp
