I have data like the data below. I would like to only return the columns from the dataframe that contain at least one non-zero value. So in the example below it would be column ALF. Returning non-zero rows doesn’t seem that tricky but selecting the column and records is giving me a little trouble.
print df
Data:
Type ADR ALE ALF AME
Seg0 0.0 0.0 0.0 0.0
Seg1 0.0 0.0 0.5 0.0
When I try something like the link below:
Pandas: How to select columns with non-zero value in a sparse table
m1 = (df['Type'] == 'Seg0')
m2 = (df[m1] != 0).all()
print (df.loc[m1,m2])
I get a key error for 'Type'
nonzero() is an argument less method. Just like it name says, rather returning non zero values from a series, it returns index of all non zero values. The returned series of indices can be passed to iloc method and return all non zero values.
count_nonzero. Counts the number of non-zero values in the array a . The word “non-zero” is in reference to the Python 2.
If you have a DataFrame and would like to access or select a specific few rows/columns from that DataFrame, you can use square brackets or other advanced methods such as loc and iloc .
DataFrame. mean() function is used to get the mean of the values over the requested axis in pandas. This by default returns a Series, if level specified, it returns a DataFrame. By default ignore NaN values and performs mean on index axis.
In my opinion you get key error
because first column is index
:
Solution use DataFrame.any
for check at least one non zero value to mask and then filter index of True
s:
m2 = (df != 0).any()
a = m2.index[m2]
print (a)
Index(['ALF'], dtype='object')
Or if need list
:
a = m2.index[m2].tolist()
print (a)
['ALF']
Similar solution is filter columns names:
a = df.columns[m2]
Detail:
print (m2)
ADR False
ALE False
ALF True
AME False
dtype: bool
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With