
Select columns with non-zero values from a DataFrame

I have data like the sample below. I would like to return only the columns of the DataFrame that contain at least one non-zero value; in the example below that would be column ALF. Returning non-zero rows doesn’t seem that tricky, but selecting both the columns and the records is giving me a little trouble.

print(df)

Data:

Type   ADR   ALE   ALF   AME
Seg0   0.0   0.0   0.0   0.0
Seg1   0.0   0.0   0.5   0.0

When I try something like the link below:

Pandas: How to select columns with non-zero value in a sparse table

m1 = (df['Type'] == 'Seg0')
m2 = (df[m1] != 0).all()

print (df.loc[m1,m2])

I get a KeyError for 'Type'.

asked Apr 11 '18 16:04 by user3476463



1 Answer

In my opinion, you get the KeyError because the first column, Type, is the index rather than a regular column.
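One way to check this is sketched below. The question does not show how df was built, so the constructor call here is an assumed reconstruction of the sample data with Type as the index; the names has_type_column, seg0 and df_flat are only illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with 'Type' as the index.
df = pd.DataFrame(
    {"ADR": [0.0, 0.0], "ALE": [0.0, 0.0], "ALF": [0.0, 0.5], "AME": [0.0, 0.0]},
    index=pd.Index(["Seg0", "Seg1"], name="Type"),
)

# 'Type' is the index name, not a column, so df['Type'] would raise KeyError.
has_type_column = "Type" in df.columns
index_name = df.index.name

# Filter rows through the index instead of through a 'Type' column.
m1 = df.index == "Seg0"
seg0 = df[m1]

# reset_index() turns 'Type' back into an ordinary column if that is preferred.
df_flat = df.reset_index()
```

With the index-based mask, the row filter from the question works without touching df['Type'].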

Solution: use DataFrame.any to build a mask of the columns that have at least one non-zero value, then filter the mask's index by its True entries:

m2 = (df != 0).any()
a = m2.index[m2]
print (a)
Index(['ALF'], dtype='object')

Or, if you need a list:

a = m2.index[m2].tolist()
print (a)
['ALF']

A similar solution is to filter the column names:

a = df.columns[m2]

Detail:

print (m2)
ADR    False
ALE    False
ALF     True
AME    False
dtype: bool
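If the goal is to return the columns themselves rather than just their names, the same mask can be passed to loc. This is a minimal sketch, again assuming the sample data with Type as the index; nonzero_cols and nonzero_rows are illustrative names.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with 'Type' as the index.
df = pd.DataFrame(
    {"ADR": [0.0, 0.0], "ALE": [0.0, 0.0], "ALF": [0.0, 0.5], "AME": [0.0, 0.0]},
    index=pd.Index(["Seg0", "Seg1"], name="Type"),
)

# True for every column holding at least one non-zero value.
m2 = (df != 0).any()

# Keep only those columns, with all rows.
nonzero_cols = df.loc[:, m2]

# Optionally also drop rows that are all zero in the remaining columns.
nonzero_rows = nonzero_cols.loc[(nonzero_cols != 0).any(axis=1)]
```

Note that NaN != 0 evaluates to True, so a column holding only zeros and NaN would also be selected; calling fillna(0) first avoids that if it is not wanted.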
answered Oct 22 '22 22:10 by jezrael