I have a DataFrame df, and the code is written like this:
df.isnull().mean().sort_values(ascending = False)
Here is part of the output:
inq_fi 1.0
sec_app_fico_range_low 1.0
I want to understand how this works. If we use df.isnull() alone, it only returns True or False for each cell. How does mean() then give the right result? My objective is to find the percentage of null values in every column. The output above suggests that inq_fi and sec_app_fico_range_low have all of their values missing.
Also, why are we not passing by to sort_values?
A breakdown looks like this:
df.isnull()
#mask each cell: True where the value is NaN, False otherwise
df.isnull().mean()
#compute the mean of Boolean mask (True evaluates as 1 and False as 0)
df.isnull().mean().sort_values(ascending = False)
#sort the resulting Series by its values, descending
As for the by argument: it is only needed for DataFrame.sort_values, where you must say which column(s) to sort by. A Series has a single set of values, so Series.sort_values sorts by them directly and takes no by parameter.
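The three steps above can be sketched on a small, made-up DataFrame (the column names here are just illustrative):

```python
import numpy as np
import pandas as pd

# hypothetical toy data: one fully-missing column, one partly missing, one complete
df = pd.DataFrame({
    "inq_fi": [np.nan, np.nan, np.nan, np.nan],
    "loan_amnt": [1000, np.nan, 3000, 4000],
    "term": [36, 60, 36, 60],
})

null_share = df.isnull().mean().sort_values(ascending=False)
print(null_share)
# inq_fi is 1.0 (4 of 4 missing), loan_amnt is 0.25 (1 of 4), term is 0.0
```

Because mean() is taken column-wise over the Boolean mask, each entry of null_share is exactly the fraction of NaN cells in that column.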
That said, a column with the values:
[np.nan, 2, 3, 4]
is evaluated as:
[True, False, False, False]
interpreted as:
[1, 0, 0, 0]
Resulting in:
0.25
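You can verify that arithmetic directly on a single Series:

```python
import numpy as np
import pandas as pd

col = pd.Series([np.nan, 2, 3, 4])
mask = col.isnull()   # [True, False, False, False]

# True counts as 1 and False as 0, so the mean is 1 / 4
print(mask.mean())    # 0.25
```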