I have a tsv file as follows.
id     ingredients             recipe
code1  egg, butter             beat eggs. add unsalted butter
code2  tim tam, butter         beat tim tam. add butter
code3  coffee, sugar           add coffee and sugar and mix
code4  sugar, fresh goat milk  beat sugar and milk together
I want to remove every row that contains any of the phrases below in either the ingredients or the recipe column.
mylist = ['tim tam', 'unsalted butter', 'fresh goat milk']
My output should look as follows.
id     ingredients    recipe
code3  coffee, sugar  add coffee and sugar and mix
Is there a way to do this using pandas? Please help me!
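Use str.contains to test whether each cell contains a substring, and join the list with '|' to build a regex that matches any of the phrases. Then negate the combined mask with ~ to keep only the rows where neither column matches: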
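mylist = ['tim tam', 'unsalted butter', 'fresh goat milk']
pattern = '|'.join(mylist)  # regex that matches any phrase in mylist
# keep rows where neither column contains a listed phrase
df[~(df.ingredients.str.contains(pattern) |
     df.recipe.str.contains(pattern))]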
Output:
      id    ingredients                        recipe
2  code3  coffee, sugar  add coffee and sugar and mix
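For completeness, here is a self-contained sketch of the same approach that can be run as-is. It makes a few assumptions that are not in the original answer: the sample rows are inlined through io.StringIO instead of being read from a file on disk, each phrase is passed through re.escape in case a phrase ever contains regex metacharacters, and na=False is used so that empty cells count as "no match". The names tsv_text, has_banned, and filtered are only illustrative.

```python
import io
import re

import pandas as pd

# Sample data from the question, inlined so the sketch runs without a file.
# With a real file, something like pd.read_csv(path, sep="\t") would be used instead.
tsv_text = (
    "id\tingredients\trecipe\n"
    "code1\tegg, butter\tbeat eggs. add unsalted butter\n"
    "code2\ttim tam, butter\tbeat tim tam. add butter\n"
    "code3\tcoffee, sugar\tadd coffee and sugar and mix\n"
    "code4\tsugar, fresh goat milk\tbeat sugar and milk together\n"
)
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

mylist = ["tim tam", "unsalted butter", "fresh goat milk"]

# Escape each phrase so characters like '.' or '+' are matched literally,
# then join with '|' so the regex matches any one of the phrases.
pattern = "|".join(re.escape(phrase) for phrase in mylist)

# True for rows where either column contains a listed phrase.
# na=False treats missing cells as "does not contain".
has_banned = (
    df["ingredients"].str.contains(pattern, na=False)
    | df["recipe"].str.contains(pattern, na=False)
)

# Keep only the rows that matched nothing.
filtered = df[~has_banned]
print(filtered)
```

If the match should ignore upper and lower case, str.contains also accepts case=False. Calling filtered.reset_index(drop=True) would renumber the rows from 0 if the original index label is not needed.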