
Keep only non-zero missing-value counts in a pandas DataFrame

Tags:

python

pandas

How can I select only the columns that contain missing values, sorted by their null counts in descending order?

Here is the dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [10, 20, 30, 40],
                   'c': [1, np.nan, np.nan, np.nan]})
     a   b    c
0  1.0  10  1.0
1  2.0  20  NaN
2  NaN  30  NaN
3  NaN  40  NaN

I can do this:

df.isnull().sum().sort_values(ascending=False)
c    3
a    2
b    0

But I want to CHAIN the commands into a single expression that also drops the zero counts.

I tried df.isnull().sum().sort_values(ascending=False).filter(lambda x: x>0), but it fails.

I know this works:

temp = df.isnull().sum().sort_values(ascending=False)
temp[temp>0]
c    3
a    2

But I am looking for a way to do this as a chained expression in ONE line.

Required:

df.isnull().sum().sort_values(ascending=False).somefunction( x > 0)

Update
I found a way: convert the Series to a DataFrame and then use query.

df.isnull().sum().sort_values(ascending=False).to_frame().rename(columns={0:'temp'}).query("temp > 0")

This looks long and superfluous. Is there a better way?

BhishanPoudel asked Feb 17 '26 18:02



1 Answer

You are misusing filter: it selects by index label, not by value. Use .loc with a callable instead:

df.isnull().sum().loc[lambda x : x>0].sort_values(ascending=False)
Out[147]: 
c    3
a    2
dtype: int64
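
To make the difference concrete, here is a minimal sketch using the DataFrame from the question. The variable names (null_counts, by_label, via_loc, via_pipe) are illustrative only, and the .pipe variant is an alternative added here, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, np.nan, np.nan],
    'b': [10, 20, 30, 40],
    'c': [1, np.nan, np.nan, np.nan],
})

null_counts = df.isnull().sum()

# Series.filter selects entries by index label (items/like/regex), not by value.
by_label = null_counts.filter(items=['a', 'c'])

# .loc accepts a callable that receives the Series and returns a boolean mask,
# so the value filter can stay inside the chain.
via_loc = null_counts.sort_values(ascending=False).loc[lambda s: s > 0]

# .pipe hands the Series to any function, so plain boolean indexing chains too.
via_pipe = null_counts.sort_values(ascending=False).pipe(lambda s: s[s > 0])

print(via_loc)
```

Both chained forms should give the same Series. Filtering before or after sort_values makes no difference to the result here, so the order is mostly a matter of taste.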
BENY answered Feb 19 '26 13:02



