
Spark DataFrame groupBy and sort in the descending order (pyspark)

I'm using pyspark (Python 2.7.9 / Spark 1.3.1) and have a grouped DataFrame (GroupObject) which I need to filter and sort in descending order. I'm trying to achieve it with this piece of code:

group_by_dataframe.count().filter("`count` >= 10").sort('count', ascending=False) 

But it throws the following error.

sort() got an unexpected keyword argument 'ascending' 
asked Dec 29 '15 by rclakmal

People also ask

How do you sort PySpark DataFrame in descending order?

You can use either the sort() or orderBy() function of a PySpark DataFrame to sort it in ascending or descending order based on a single column or multiple columns; you can also sort using PySpark SQL sorting functions.

How does PySpark sort grouped data?

You can sort the grouped result using the sort() function, accessing the column with the col() function and applying desc() to sort it in descending order.

How do you sort a DataFrame based on a column in PySpark?

We can use either orderBy() or sort() method to sort the data in the dataframe. Pass asc() to sort the data in ascending order; otherwise, desc(). We can do this based on a single column or multiple columns.

What is the difference between orderBy and sort by in Spark?

On a DataFrame, sort() and orderBy() are aliases, and both produce a globally sorted result. The partition-local behaviour often attributed to sort() actually belongs to sortWithinPartitions() (SORT BY in Spark SQL): it sorts each partition individually, which is cheaper because it avoids a full shuffle, but the overall order of the output is not guaranteed. ORDER BY / orderBy() performs a global sort across all partitions.


1 Answer

In Spark 1.3 the sort method doesn't take an ascending parameter. You can use the desc method of Column instead:

from pyspark.sql.functions import col

(group_by_dataframe
    .count()
    .filter("`count` >= 10")
    .sort(col("count").desc()))

or the desc function:

from pyspark.sql.functions import desc

(group_by_dataframe
    .count()
    .filter("`count` >= 10")
    .sort(desc("count")))

Both methods can be used with Spark >= 1.3 (including Spark 2.x).

answered Oct 12 '22 by zero323