Say I have a pyspark dataframe:
df.show()
+-----+---+
|    x|  y|
+-----+---+
|alpha|  1|
| beta|  2|
|gamma|  1|
|alpha|  2|
+-----+---+
I want to count how many occurrences of alpha, beta and gamma there are in column x. How do I do this in pyspark?
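Use pyspark.sql.DataFrame.cube(). Besides one row per distinct value of x, the result also contains a row where x is null, which holds the total row count: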
df.cube("x").count().show()