I have a DataFrame with lots of columns. I have a condition that tests whether any column in a given subset of those columns is different from zero.
Is there a more elegant way to apply that condition to a subset of columns? My current code is:
df['indicator'] = (
(df['col_1'] != 0) |
(df['col_2'] != 0) |
(df['col_3'] != 0) |
(df['col_4'] != 0) |
(df['col_5'] != 0)
)
I was looking for something like this pseudo code:
columns = ['col_1', 'col_2', 'col_3', 'col_4', 'col_5']
df['indicator'] = df.any(columns, lambda value: value != 0)
ne is the method form of !=. I use it so that chaining any looks nicer. I use any(axis=1) to find whether any are true in a row.
df['indicator'] = df[columns].ne(0).any(axis=1)
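As a minimal, self-contained sketch of the ne(0).any(axis=1) approach: the sample data and the extra column named other are made up for illustration, and only the col_* names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the col_* names follow the question.
df = pd.DataFrame({
    'col_1': [0, 1, 0],
    'col_2': [0, 0, 0],
    'col_3': [0, 0, 2],
    'other': [9, 9, 9],  # not part of the check, so it is ignored
})
columns = ['col_1', 'col_2', 'col_3']

# Element-wise "!= 0" on the selected columns, then a row-wise "any".
df['indicator'] = df[columns].ne(0).any(axis=1)
print(df['indicator'].tolist())  # [False, True, True]
```

Selecting df[columns] first keeps the unrelated column out of the test, so the first row stays False even though other is non-zero there.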
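In this particular case you could also check whether the sum of the corresponding columns is != 0. Note that this only works if the values cannot cancel each other out (for example, if they are all non-negative), and that prod(axis=1).ne(0) would instead test that all of the columns are non-zero:
df['indicator'] = df[columns].sum(axis=1).ne(0)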
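To see how the row-wise aggregates relate to the "any column is non-zero" test, here is a small comparison sketch. The data is invented for illustration, and the variable names (any_nonzero, sum_nonzero and so on) are hypothetical, not from the original posts.

```python
import pandas as pd

# Hypothetical data, including a row where values cancel (1 and -1).
df = pd.DataFrame({
    'col_1': [0, 1, 1, 2],
    'col_2': [0, 0, -1, 3],
})
columns = ['col_1', 'col_2']

any_nonzero = df[columns].ne(0).any(axis=1)          # reference result
sum_nonzero = df[columns].sum(axis=1).ne(0)          # misses the cancelling row
abs_sum_nonzero = df[columns].abs().sum(axis=1).ne(0)  # avoids cancellation
prod_nonzero = df[columns].prod(axis=1).ne(0)        # true only if ALL are non-zero

print(any_nonzero.tolist())      # [False, True, True, True]
print(sum_nonzero.tolist())      # [False, True, False, True]
print(abs_sum_nonzero.tolist())  # [False, True, True, True]
print(prod_nonzero.tolist())     # [False, False, True, True]
```

On this sample the plain sum agrees with the any-based test only when values cannot cancel, while the product answers a different question (all columns non-zero), so the ne(0).any(axis=1) form is the safer default.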
PS @piRSquared's solution is much more generic...