Filter columns of only zeros from a Pandas data frame

Tags: python, pandas

I have a Pandas DataFrame from which I would like to remove every column that contains only zeros. For example, in the DataFrame below, I'd like to remove column 2:

        0      1      2      3      4
0   0.381  0.794  0.000  0.964  0.304
1   0.538  0.029  0.000  0.327  0.928
2   0.041  0.312  0.000  0.208  0.284
3   0.406  0.786  0.000  0.334  0.118
4   0.511  0.166  0.000  0.181  0.980

How can I do this? I've been trying something like this:

df.filter(lambda x: x == 0)
turtle asked Sep 13 '12 17:09


1 Answer

The following works for me. It returns a Series indexed by column name, where each value is True if every item in that column is 0 and False otherwise.

import numpy as np
import pandas as pd
# Create DataFrame "df" like yours...

df.apply(lambda x: np.all(x == 0))
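To make that concrete, here is a minimal sketch that rebuilds the frame from the question (the values are copied from the table above; the names `all_zero` and `all_zero_vectorized` are mine, just for illustration). It also shows a vectorized spelling, `(df == 0).all()`, which should give the same mask without `apply`:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (column 2 is all zeros).
df = pd.DataFrame([
    [0.381, 0.794, 0.0, 0.964, 0.304],
    [0.538, 0.029, 0.0, 0.327, 0.928],
    [0.041, 0.312, 0.0, 0.208, 0.284],
    [0.406, 0.786, 0.0, 0.334, 0.118],
    [0.511, 0.166, 0.0, 0.181, 0.980],
])

# One boolean per column: True where every value in the column equals 0.
all_zero = df.apply(lambda col: np.all(col == 0))

# Same mask computed column-wise without apply.
all_zero_vectorized = (df == 0).all()

print(all_zero.tolist())  # [False, False, True, False, False]
```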

And if you want to actually drop the columns that contain only zeros:

df[df.columns[(df != 0).any()]]
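As a quick check, here is a small sketch of the same idea on the frame from the question, written with `df.loc` and a boolean column mask instead of indexing `df.columns` first. The names `keep` and `filtered` are only illustrative. One thing to be aware of: `NaN != 0` evaluates to True, so a column made up entirely of NaN would be kept by this test.

```python
import pandas as pd

# Rebuild the example frame from the question (column 2 is all zeros).
df = pd.DataFrame([
    [0.381, 0.794, 0.0, 0.964, 0.304],
    [0.538, 0.029, 0.0, 0.327, 0.928],
    [0.041, 0.312, 0.0, 0.208, 0.284],
    [0.406, 0.786, 0.0, 0.334, 0.118],
    [0.511, 0.166, 0.0, 0.181, 0.980],
])

# True for every column that has at least one non-zero value.
keep = (df != 0).any(axis=0)

# Select all rows and only the columns flagged True.
filtered = df.loc[:, keep]

print(filtered.columns.tolist())  # [0, 1, 3, 4]
```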
ely answered Nov 12 '22 00:11