
How to sort by column in descending order in Spark SQL?

I tried df.orderBy("col1").show(10), but it sorted in ascending order. df.sort("col1").show(10) also sorts in ascending order. I searched Stack Overflow, and the answers I found were all outdated or referred to RDDs. I'd like to use the native DataFrame API in Spark.

Vedom asked Oct 21 '22 13:10



People also ask

How do you sort a column in descending order in PySpark?

By default, orderBy sorts in ascending order. To sort a DataFrame in descending order, import desc from pyspark.sql.functions (or call the desc() method on a column) and pass the result to orderBy. Sorting works on a single column or on multiple columns.

How do I sort a column in Spark?

In Spark, you can use either the sort() or orderBy() method of DataFrame/Dataset to sort in ascending or descending order on one or more columns. You can also sort using the Spark SQL sorting functions.

How do you sort a column in ascending order in PySpark?

Use either orderBy() or sort() to sort the data in a DataFrame. Pass asc() to sort in ascending order, or desc() to sort in descending order. Either works on a single column or on multiple columns.

How do I order columns in Spark DataFrame?

To rearrange or reorder the columns of a PySpark DataFrame, use select(). To put the columns in ascending order by name, pass the column list through Python's sorted() function; for descending order, use sorted() with reverse=True. Columns can also be rearranged by position.


1 Answer

You can sort by the column after importing the Spark SQL functions, using asc() for ascending order or desc() for descending order:

import org.apache.spark.sql.functions._
df.orderBy(asc("col1"))

Or

import org.apache.spark.sql.functions._
df.sort(desc("col1"))

Or, after importing sqlContext.implicits._, call the desc method on the column:

import sqlContext.implicits._
df.orderBy($"col1".desc)

Or

import sqlContext.implicits._
df.sort($"col1".desc)
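
For reference, here is a minimal self-contained sketch that puts the two styles above together and adds a multi-column sort. It assumes Spark 2.0 or later, where spark.implicits._ on a SparkSession plays the role of sqlContext.implicits._. The object name, the helper sortCol1Desc, the local master setting, the sample rows, and the col2 column are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{asc, desc}

object SortDescExample {
  // Hypothetical helper: col1 descending, ties broken by col2 ascending.
  def sortCol1Desc(df: DataFrame): DataFrame =
    df.orderBy(desc("col1"), asc("col2"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("sort-desc-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    // Made-up sample data; "col1" matches the question, "col2" is invented.
    val df = Seq((3, "a"), (1, "b"), (2, "c"), (3, "d")).toDF("col1", "col2")

    // Descending on one column via the desc() function.
    df.orderBy(desc("col1")).show(10)

    // The same sort via the Column.desc method.
    df.sort($"col1".desc).show(10)

    // Mixed directions; rows come back as (3,a), (3,d), (2,c), (1,b).
    sortCol1Desc(df).show(10)

    spark.stop()
  }
}
```

Both orderBy and sort accept several sort expressions, so each column can carry its own direction, as in the last call.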
Gabber answered Oct 24 '22 03:10
