I want to filter a pandas DataFrame. A sample table is shown below. For example, I want to remove all columns that contain the value 2.
C1 C2 C3 C4 C5
1 1 1 1 4
1 2 1 2 5
1 1 3 1 4
I want a result table like this (C2 and C4 removed):
C1 C3 C5
1 1 4
1 1 5
1 3 4
I also want to do the same for rows, using the value 5 (the second row is removed):
C1 C3 C5
1 1 4
1 3 4
I can do this easily for a single column, e.g. df = df[df.C2 != 2], but I don't have a good idea for checking multiple or all columns and rows at once. Is there a simple way to do this?
You can select with loc, building a boolean mask with any and specifying the axis (axis=0 checks each column, axis=1 checks each row):
print(df)
C1 C2 C3 C4 C5
0 1 1 1 1 4
1 1 2 1 2 5
2 1 1 3 1 4
print(~(df == 2))
C1 C2 C3 C4 C5
0 True True True True True
1 True False True False True
2 True True True True True
df = df.loc[:, ~(df == 2).any(axis=0)]
print(df)
C1 C3 C5
0 1 1 4
1 1 1 5
2 1 3 4
df = df.loc[~(df == 5).any(axis=1)]
print(df)
C1 C3 C5
0 1 1 4
2 1 3 4
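For reference, here is a self-contained sketch of the same two steps, assuming Python 3 and a reasonably recent pandas. The sample frame is rebuilt from the question, and the names no_twos and result are only illustrative; they are not part of the original answer.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame(
    [[1, 1, 1, 1, 4],
     [1, 2, 1, 2, 5],
     [1, 1, 3, 1, 4]],
    columns=["C1", "C2", "C3", "C4", "C5"],
)

# (df == 2) is a boolean frame; any(axis=0) reduces it to one flag per column.
# Negating the flags keeps only the columns in which no cell equals 2.
no_twos = df.loc[:, ~(df == 2).any(axis=0)]

# any(axis=1) reduces to one flag per row instead.
# Negating the flags keeps only the rows in which no cell equals 5.
result = no_twos.loc[~(no_twos == 5).any(axis=1)]

print(no_twos)
print(result)
```

Note that the row filter runs on the already column-filtered frame, as in the answer above. Keeping the intermediate results in separate variables, rather than reassigning df, makes it easier to inspect each step.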