
How to sort by count with groupby in dataframe spark

Tags: python, pyspark

I want to sort this count column in descending order, but I keep getting a 'NoneType' object is not callable error. How can I add a sort to this so I don't get the error?

from pyspark.sql.functions import hour
hour = checkin.groupBy(hour("date").alias("hour")).count().show()



1 Answer

.show() returns None, so no DataFrame method can be chained after it. Worse, the question assigns that None back to the name hour, which shadows the imported hour function; that is why the next run raises 'NoneType' object is not callable. Drop the .show() from the middle of the chain and use orderBy to sort the result DataFrame:

from pyspark.sql.functions import hour, col

# Assign to a new name so the imported hour function is not shadowed
hourly_counts = checkin.groupBy(hour("date").alias("hour")).count().orderBy(col("count").desc())

Or:

from pyspark.sql.functions import hour, desc
checkin.groupBy(hour("date").alias("hour")).count().orderBy(desc('count')).show()
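
For a quick end-to-end check, here is a minimal, self-contained sketch of the same fix. The checkin data below is made up for illustration; only the date column name comes from the question:

from pyspark.sql import SparkSession
from pyspark.sql.functions import hour, desc, to_timestamp

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for the checkin DataFrame from the question
checkin = spark.createDataFrame(
    [("2017-01-01 09:15:00",), ("2017-01-01 09:45:00",), ("2017-01-01 17:30:00",)],
    ["date"],
).withColumn("date", to_timestamp("date"))

# Group by hour of day, count rows, and sort by count descending.
# The only .show() is at the very end of the chain, never in the middle.
checkin.groupBy(hour("date").alias("hour")).count().orderBy(desc("count")).show()

With this sample data, hour 9 (count 2) is listed before hour 17 (count 1).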
Answered by Psidom
