How to order my tuple of Spark results in descending order by value

I am new to Spark and Scala. I need to order my (course, count) result tuples in descending order by count. I tried the following:

 val results = ratings.countByValue()            // counts of each value
 val sortedResults = results.toSeq.sortBy(_._2)  // sorts by count, ascending

But it isn't working the way I need. The above sorts the results by count in ascending order, but I need descending order. Can anybody please help me?

The results currently come out like this:

(History, 12100),
(Music, 13200),
(Drama, 143000)

But I need to display them like this:

(Drama, 143000),
(Music, 13200),
(History, 12100)

Thanks.

Dilee asked Jan 29 '17

People also ask

How do I sort in descending order in Spark SQL?

In Spark, we can use either the sort() or orderBy() function of a DataFrame/Dataset to sort by one or more columns in ascending or descending order. You can also sort using Spark SQL sorting functions such as asc_nulls_first(), asc_nulls_last(), desc_nulls_first(), and desc_nulls_last().
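A minimal sketch in Scala, assuming a SparkSession named spark and a toy (course, count) DataFrame built here purely for illustration:

 import org.apache.spark.sql.functions.{desc, desc_nulls_last}
 import spark.implicits._

 // toy DataFrame standing in for the real data
 val df = Seq(("History", 12100), ("Music", 13200), ("Drama", 143000))
   .toDF("course", "count")

 df.orderBy(desc("count")).show()          // descending by count
 df.sort(desc_nulls_last("count")).show()  // descending, nulls placed last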

How do I orderBy DESC in Spark?

In order to sort in descending order in a Spark DataFrame, we can use the desc property of the Column class or the desc() sql function.
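For instance, reusing the hypothetical df from the sketch above, both forms produce the same descending order:

 import org.apache.spark.sql.functions.{col, desc}

 df.sort(col("count").desc).show()   // desc property of the Column class
 df.sort(desc("count")).show()       // desc() sql function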

How do you sort an RDD by value?

sortBy() is used to sort the data by value efficiently in PySpark. It is a method available on an RDD. It takes a lambda expression that extracts the key to sort on.
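The answer above refers to PySpark; the same idea in Scala (the language used in the question) might look like this, assuming an existing SparkContext sc and a toy pair RDD:

 // pair RDD standing in for the real (course, count) data
 val counts = sc.parallelize(Seq(("History", 12100L), ("Music", 13200L), ("Drama", 143000L)))

 // sortBy keeps the sort distributed; ascending = false gives descending order
 val sortedDesc = counts.sortBy(_._2, ascending = false)
 sortedDesc.collect().foreach(println)   // (Drama,143000), (Music,13200), (History,12100)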

How do you sort data in a DataFrame PySpark?

We can use either the orderBy() or sort() method to sort the data in the DataFrame. Pass asc() to sort the data in ascending order, or desc() for descending. We can do this based on a single column or multiple columns.
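In Scala, sorting on one or several columns might look like this (again reusing the hypothetical df):

 import org.apache.spark.sql.functions.{asc, desc}

 df.orderBy(desc("count")).show()                  // single column, descending
 df.orderBy(asc("course"), desc("count")).show()   // course ascending, then count descending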


1 Answer

You can use

.sortWith(_._2 > _._2)

Most of the time calling toSeq is not a good idea, because the driver needs to hold all of it in memory and you might run out of memory on larger data sets. I guess this is OK for an intro to Spark.
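Putting it together with the code from the question (a sketch; note that countByValue() already returns a plain Scala Map on the driver, so this sort runs locally):

 val results = ratings.countByValue()                     // Map of value -> count on the driver
 val sortedResults = results.toSeq.sortWith(_._2 > _._2)  // descending by count
 sortedResults.foreach(println)

 // equivalent alternatives:
 //   results.toSeq.sortBy(-_._2)
 //   results.toSeq.sortBy(_._2)(Ordering[Long].reverse)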

Marko Švaljek answered Nov 11 '22