I am doing analysis by pandas.
My table has 7M rows* 30 columns. Cell values are ranged from -1 to 3 randomly. Now I want to filter out rows based on columns' value.
I understand how to select based on multiple conditions, write down conditions and combine by "&" "|".
But I have 30 columns to filter and filter by the same value. For instance, last 12 columns' value equals -1 need to be selected
df.iloc[:,-12:]==-1
The code above gives me a boolean. I need actual data frame.
The logic here is "or", means if any column has value -1, that row needs to be selected.
Also, it is good to know what if I need "and", all columns have value -1?
Many thanks
For the OR
case, use DF.any
(returns True
if any element is True
along a particular axis):
df[(df.iloc[:,-12:] == -1).any(axis=1)]
For the AND
case, use DF.all
(returns True
if all elements are True
along a particular axis):
df[(df.iloc[:,-12:] == -1).all(axis=1)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With