I want to drop rows where any column contains one of the keywords:
import pandas as pd
keywords = ['Nokia', 'Asus']
data = [['Nokia', 'AB123', 'broken'], ['iPhone', 'DF747', 'battery'], ['Acer', 'KH298', 'exchanged for a nokia'], ['Blackberry', 'jj091', 'exchanged for a Asus']]
df = pd.DataFrame(data, columns=['Brand', 'ID', 'Description'])
df before:
Brand | ID | Description
----------------------------------------
Nokia | AB123 | broken
iPhone | DF747 | battery
Acer | KH298 | exchanged for a nokia
Blackberry | jj091 | exchanged for a Asus
df after:
Brand | ID | Description
----------------------------------------
iPhone | DF747 | battery
Acer | KH298 | exchanged for a nokia
How can I achieve this?
You can join all columns together with + or with apply, and then create a mask with Series.str.contains, passing the keywords joined by | (regex OR):
df = df[~(df['Brand']+df['ID']+df['Description']).str.contains('|'.join(keywords))]
Or:
df = df[~df.apply(' '.join, 1).str.contains('|'.join(keywords))]
print (df)
Brand ID Description
1 iPhone DF747 battery
2 Acer KH298 exchanged for a nokia
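To see what the one-liners above are doing, here is a minimal sketch that splits the same idea into steps. The intermediate names (pattern, row_text, mask, filtered) are only for illustration and are not part of the original answer.

```python
import pandas as pd

keywords = ['Nokia', 'Asus']
data = [['Nokia', 'AB123', 'broken'],
        ['iPhone', 'DF747', 'battery'],
        ['Acer', 'KH298', 'exchanged for a nokia'],
        ['Blackberry', 'jj091', 'exchanged for a Asus']]
df = pd.DataFrame(data, columns=['Brand', 'ID', 'Description'])

# Build a regex alternation from the keywords: 'Nokia|Asus'
pattern = '|'.join(keywords)

# Collapse each row into a single string so one search covers every column
row_text = df.apply(' '.join, axis=1)

# True for rows where any keyword occurs (case-sensitive by default)
mask = row_text.str.contains(pattern)

# Keep only the rows that did NOT match
filtered = df[~mask]
print(filtered)
```

Note that ' '.join only works here because every column holds strings; with numeric columns you would probably need to convert with astype(str) first.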
If you need a case-insensitive match, add the case=False parameter:
df = df[~df.apply(' '.join, 1).str.contains('|'.join(keywords), case=False)]
print (df)
Brand ID Description
1 iPhone DF747 battery
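One caveat with the approaches above: the keywords are used as a regular expression, so a keyword containing characters such as . or + would be interpreted as regex syntax. A possible variant, sketched here under the assumption that the keywords should be matched literally, escapes them with re.escape and checks each column separately instead of joining the row into one string. The names hits and filtered_ci are hypothetical.

```python
import re
import pandas as pd

keywords = ['Nokia', 'Asus']
data = [['Nokia', 'AB123', 'broken'],
        ['iPhone', 'DF747', 'battery'],
        ['Acer', 'KH298', 'exchanged for a nokia'],
        ['Blackberry', 'jj091', 'exchanged for a Asus']]
df = pd.DataFrame(data, columns=['Brand', 'ID', 'Description'])

# Escape each keyword so regex metacharacters are treated literally
pattern = '|'.join(re.escape(k) for k in keywords)

# Test every column on its own; astype(str) guards against non-string columns
hits = df.astype(str).apply(lambda col: col.str.contains(pattern, case=False))

# Drop a row if any of its columns matched
filtered_ci = df[~hits.any(axis=1)]
print(filtered_ci)
```

Checking per column also avoids accidental matches that span two adjacent columns, which can happen when columns are concatenated with + and no separator.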