 

Filtering and counting negative/positive values from a Spark dataframe using pyspark?

I have no clue how to filter for positive or negative values within a column using pyspark, can you help?

I have a Spark dataframe with 10MM+ rows and 50+ columns, and I need to count how many times the values in one specific column are equal to or less than 0.

Thanks in advance.

Giordan Pretelin asked Oct 24 '25 16:10


1 Answer

For the column you want to target, you can simply filter the dataframe to the rows where the value is <= 0 and count the rows that meet the criterion.

import pyspark.sql.functions as func

df.filter(func.col("colname") <= 0).count()
vielkind answered Oct 26 '25 10:10


