Count number of columns with some values for each row in pandas

Tags: python, pandas, dataframe

I have a DataFrame like this, named data:

Site code    Col1  Col2  Col3
A5252        24    53     NaN
A5636        36    NaN    NaN
A4366        NaN   NaN    NaN
A7578        42    785    24

For each row, I want to count the number of columns that hold a value (anything except NaN). Desired output:

 Site code   Col1  Col2  Col3  Count
    A5252     24    53     NaN    2
    A5636     36    NaN    NaN    1
    A4366     NaN   NaN    NaN    0
    A7578     42    785    24     3

Something opposite to this: df = data.isnull().sum(axis=1)

asked Jun 23 '17 08:06

jovicbg

2 Answers

You need to change isnull to notnull:

# if the first column is not the index yet, set it
data = data.set_index('Site code')
data['Count'] = data.notnull().sum(axis=1)

Or use the DataFrame.count function:

data = data.set_index('Site code')
data['Count'] = data.count(axis=1)
print (data)
           Col1   Col2  Col3  Count
Site code                          
A5252      24.0   53.0   NaN      2
A5636      36.0    NaN   NaN      1
A4366       NaN    NaN   NaN      0
A7578      42.0  785.0  24.0      3
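For anyone who wants to try the two variants above without the original file, here is a minimal sketch. The DataFrame is rebuilt by hand from the table in the question, and the names via_notnull and via_count are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question; 'Site code' becomes the index
data = pd.DataFrame({
    "Site code": ["A5252", "A5636", "A4366", "A7578"],
    "Col1": [24, 36, np.nan, 42],
    "Col2": [53, np.nan, np.nan, 785],
    "Col3": [np.nan, np.nan, np.nan, 24],
}).set_index("Site code")

# notnull() gives a boolean frame; summing across axis=1 counts the True cells per row
via_notnull = data.notnull().sum(axis=1)

# count(axis=1) counts the non-NaN cells per row directly
via_count = data.count(axis=1)

data["Count"] = via_count
print(data)
```

Both calls should give the same per-row result here; count(axis=1) simply skips the intermediate boolean frame.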

Another solution selects the value columns with loc (here Site code is a column, not the index):

print (data.loc[:, 'Col1':])
   Col1   Col2  Col3
0  24.0   53.0   NaN
1  36.0    NaN   NaN
2   NaN    NaN   NaN
3  42.0  785.0  24.0

data['Count'] = data.loc[:, 'Col1':].count(axis=1)
print (data)
  Site code  Col1   Col2  Col3  Count
0     A5252  24.0   53.0   NaN      2
1     A5636  36.0    NaN   NaN      1
2     A4366   NaN    NaN   NaN      0
3     A7578  42.0  785.0  24.0      3

Another nice idea from Jon Clements - use filter:

data['Count'] = data.filter(regex="^Col").count(axis=1)
print (data)

  Site code  Col1   Col2  Col3  Count
0     A5252  24.0   53.0   NaN      2
1     A5636  36.0    NaN   NaN      1
2     A4366   NaN    NaN   NaN      0
3     A7578  42.0  785.0  24.0      3
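The loc and filter variants can be compared side by side in a small sketch, again assuming the sample data from the question with Site code kept as an ordinary column. The variable names by_loc, by_filter and rerun_loc are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question; 'Site code' stays a regular column
data = pd.DataFrame({
    "Site code": ["A5252", "A5636", "A4366", "A7578"],
    "Col1": [24, 36, np.nan, 42],
    "Col2": [53, np.nan, np.nan, 785],
    "Col3": [np.nan, np.nan, np.nan, 24],
})

# Label slice: every column from 'Col1' to the last one
by_loc = data.loc[:, "Col1":].count(axis=1)

# Name pattern: every column whose name starts with 'Col'
by_filter = data.filter(regex="^Col").count(axis=1)

data["Count"] = by_filter

# Running the label slice again now also picks up the new 'Count' column
rerun_loc = data.loc[:, "Col1":].count(axis=1)
# The name pattern does not match 'Count', so it stays stable
rerun_filter = data.filter(regex="^Col").count(axis=1)
```

The label slice depends on column order, so it includes any column appended later (such as Count itself), while the regex depends only on the column names. Which one fits better depends on how the real columns are named and ordered.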

answered Oct 21 '22 14:10

jezrael

Simply use notnull():

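import pandas as pd
df = pd.read_csv("your_csv.csv")

# leave out the 'Site code' identifier column so it is not counted as a value
df['count'] = df.drop(columns='Site code').notnull().sum(axis=1)

print(df)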

Also, to add a column to a DataFrame, just use:

df['new_column_name'] = newcolumn

output:

 Site code   Col1  Col2  Col3  count
    A5252     24    53     NaN    2
    A5636     36    NaN    NaN    1
    A4366     NaN   NaN    NaN    0
    A7578     42    785    24     3
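One point worth checking with this approach: when Site code is read as an ordinary column, notnull() treats it as a value too, so every row count is one higher than the desired output. The sketch below illustrates this with an in-memory CSV standing in for your_csv.csv; the CSV text is an assumption rebuilt from the question's table, and with_id / values_only are illustrative names.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file (contents assumed from the question's table)
csv_text = """Site code,Col1,Col2,Col3
A5252,24,53,
A5636,36,,
A4366,,,
A7578,42,785,24
"""
df = pd.read_csv(io.StringIO(csv_text))

# Counting over all columns also counts the always-filled 'Site code'
with_id = df.notnull().sum(axis=1)

# Dropping the identifier column first counts only the value columns
values_only = df.drop(columns="Site code").notnull().sum(axis=1)

df["count"] = values_only
print(df)
```

Reading the file with index_col="Site code" would be another way to keep the identifier out of the count.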

answered Oct 21 '22 13:10

void
