I am looking to find if two different strings are present in a row of a dataframe.
For example, I currently have this code, which returns the rows that contain item a OR item b.
import re
items = 'a|b'
df1 = train[train['antecedents'].str.contains(items, flags=re.IGNORECASE, regex=True)]
As helpful as this is, I am looking to find all rows that have item a AND b.
I can't chain multiple str.contains calls by hand, because the number of items isn't known until they are entered into the items variable, and I don't know how to incorporate '&' into str.contains (I've tried, and it doesn't work).
Is there a different way to incorporate the '&'?
Just combine the two conditions with the & operator:
df1 = train[train.antecedents.str.contains('a', case=False)
            & train.antecedents.str.contains('b', case=False)]
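Since the question says the number of items is not known in advance, the same idea can probably be generalized by building one mask per item and AND-ing them together. This is only a sketch: the sample train frame and the items list are made up for illustration, and re.escape assumes each item should be matched as a literal substring rather than as a regex.

```python
import operator
import re
from functools import reduce

import pandas as pd

# Hypothetical sample data standing in for the asker's `train` frame.
train = pd.DataFrame({"antecedents": ["a, b", "A only", "b, c, a", "c", None]})

# Items to require; in the question these arrive at runtime.
items = ["a", "b"]

# One boolean mask per item (case-insensitive, missing values count as no match).
masks = [
    train["antecedents"].str.contains(re.escape(item), case=False, regex=True, na=False)
    for item in items
]

# AND all masks together: a row is kept only if every item is present.
all_present = reduce(operator.and_, masks)
df1 = train[all_present]
```

Note that this is substring matching, so 'a' also matches any longer string that merely contains the letter a.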
Regex alternative:
df1 = train[train.antecedents.str.contains('a.*b|b.*a', regex=True, flags=re.I)]
a.*b|b.*a is a regex alternation: it matches only when the input string contains both a and b, in either order relative to one another.
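Writing out every ordering gets unwieldy beyond two items, so for a variable-length list one option is to build a pattern from lookaheads instead, one (?=.*item) per item. The following is a sketch under the same assumptions as before: the train frame and items list are invented for illustration, and each item is escaped so that it is treated as a literal substring.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the asker's `train` frame.
train = pd.DataFrame({"antecedents": ["a, b", "A only", "b, c, a", "c"]})

# Items that must all be present; supplied at runtime in the question.
items = ["a", "b", "c"]

# Each lookahead asserts that one item occurs somewhere in the string,
# so the combined pattern matches only if every item is present.
pattern = "".join(f"(?=.*{re.escape(item)})" for item in items)

mask = train["antecedents"].str.contains(
    pattern,
    flags=re.IGNORECASE | re.DOTALL,  # DOTALL lets .* span newlines
    regex=True,
    na=False,
)
df1 = train[mask]
```

This keeps everything in a single str.contains call, at the cost of a pattern that is harder to read than the mask-per-item version.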