Python / Pyspark - Count NULL, empty and NaN

Tags: python, pyspark

I want to count NULL, empty, and NaN values in a column. I tried it like this:

df.filter( (df["ID"] == "") | (df["ID"].isNull()) | ( df["ID"].isnan()) ).count()

But I always get this error message:

TypeError: 'Column' object is not callable

Does anyone have an idea what might be the problem?

Many thanks in advance!

asked Jan 12 '18 by qwertz


People also ask

Does PySpark count include null?

Does PySpark count include null? A plain df.count() counts every row, whether or not it contains nulls. The count of NULL values in a column of a PySpark dataframe is obtained using the isNull() function, and the count of NaN values using the isnan() function.
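
As a rough sketch (the column name and sample data are made up for illustration), the two checks are typically combined like this:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, isnan

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample: one regular value, one NULL, one NaN
df = spark.createDataFrame([(1.0,), (None,), (float("nan"),)], ["ID"])

# isNull() catches NULL/None, isnan() catches NaN
df.filter(col("ID").isNull()).count()   # 1
df.filter(isnan(col("ID"))).count()     # 1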

Is null and is not null in PySpark?

The isNull() function is used to check whether the current expression or column value is NULL/None; it returns True when the value is NULL/None. The complementary isNotNull() function returns True for non-null values. Both are part of pyspark.sql.
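
A minimal sketch of both checks on a made-up string column:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a",), (None,)], ["name"])

df.filter(col("name").isNull()).show()     # rows where name IS NULL
df.filter(col("name").isNotNull()).show()  # rows where name IS NOT NULL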

How do you count distinct in PySpark?

In PySpark, there are two ways to get the count of distinct values. We can chain the distinct() and count() methods of a DataFrame to get the distinct count. Another way is to use the countDistinct() aggregate function, which returns the distinct value count of the selected columns.
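
For illustration, a sketch of both approaches on a made-up column (note that distinct() keeps NULL as its own value, while countDistinct() ignores NULLs):

from pyspark.sql import SparkSession
from pyspark.sql.functions import countDistinct

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a",), ("a",), ("b",), (None,)], ["ID"])

df.select("ID").distinct().count()              # 3 (includes NULL)
df.select(countDistinct("ID")).collect()[0][0]  # 2 (ignores NULL)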

How do you replace NaN with 0 in PySpark?

In PySpark, DataFrame.fillna() or DataFrameNaFunctions.fill() is used to replace NULL/None values on all or selected DataFrame columns with zero (0), an empty string, a space, or any constant literal value.
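
A small sketch on a made-up numeric column; for numeric columns, fillna() replaces both NULL and NaN:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1.0,), (None,), (float("nan"),)], ["ID"])

# Replace NULL and NaN in the numeric "ID" column with 0
df.fillna(0, subset=["ID"]).show()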


1 Answer

isnan is not a method of the Column class; you need to import it from pyspark.sql.functions:

from pyspark.sql.functions import isnan

And use it like:

df.filter((df["ID"] == "") | df["ID"].isNull() | isnan(df["ID"])).count()
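For context, here is a minimal, self-contained sketch of the corrected filter (the sample data and the numeric column are made up; the empty-string check from the question only matters for string-typed columns, so it is omitted here):

from pyspark.sql import SparkSession
from pyspark.sql.functions import isnan

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1.0,), (None,), (float("nan"),)], ["ID"])

# With isnan imported as a function, NULL and NaN rows are both matched
df.filter(df["ID"].isNull() | isnan(df["ID"])).count()  # 2
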
answered Oct 02 '22 by Psidom