pandas dataframe str.contains() AND operation

I have a df (Pandas Dataframe) with three rows:

col_name
"apple is delicious"
"banana is delicious"
"apple and banana both are delicious"

The function df.col_name.str.contains("apple|banana") will catch all of the rows:

"apple is delicious",
"banana is delicious",
"apple and banana both are delicious".

How do I apply an AND operator to the str.contains() method, so that it only grabs strings that contain BOTH "apple" and "banana"?

"apple and banana both are delicious"

I'd like to grab strings that contain all of 10-20 different words (grape, watermelon, berry, orange, ..., etc.).

578

asked May 03 '16 18:05

aerin

4 Answers

You can do that as follows:

df[(df['col_name'].str.contains('apple')) & (df['col_name'].str.contains('banana'))]
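
For the 10-20 word case mentioned in the question, chaining `&` by hand gets unwieldy. A minimal sketch of the same idea, assuming the words are held in a list (the `words` list and the sample frame here are illustrative), builds one boolean mask per word and ANDs them together:

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({"col_name": [
    "apple is delicious",
    "banana is delicious",
    "apple and banana both are delicious",
]})

words = ["apple", "banana"]  # illustrative; extend to 10-20 words as needed

# One boolean mask per word; regex=False treats each word as a literal substring.
masks = [df["col_name"].str.contains(w, regex=False) for w in words]

# Combine the masks element-wise with AND.
all_mask = reduce(operator.and_, masks)

result = df[all_mask]
print(result)
```

Using `regex=False` avoids surprises if a word happens to contain regex metacharacters such as `.` or `+`.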

132

answered Oct 13 '22 04:10

flyingmeatball

You can also do it with a regular expression, using one lookahead per word:

df[df['col_name'].str.contains(r'^(?=.*apple)(?=.*banana)')]

You can then build your list of words into a regex string like so:

base = r'^{}'
expr = '(?=.*{})'
words = ['apple', 'banana', 'cat']  # example
base.format(''.join(expr.format(w) for w in words))

will render:

'^(?=.*apple)(?=.*banana)(?=.*cat)'

You can then pass the generated pattern to str.contains() and filter on any list of words dynamically.
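
Putting the pieces together, a minimal end-to-end sketch might look like the following. The `re.escape` call is an addition not in the answer above; it is there on the assumption that the words should match literally even if they contain regex metacharacters.

```python
import re

import pandas as pd

df = pd.DataFrame({"col_name": [
    "apple is delicious",
    "banana is delicious",
    "apple and banana both are delicious",
]})

words = ["apple", "banana"]  # illustrative word list

# One lookahead per word, anchored at the start of the string.
# re.escape guards against metacharacters inside the words.
pattern = "^" + "".join("(?=.*{})".format(re.escape(w)) for w in words)

mask = df["col_name"].str.contains(pattern)
print(pattern)
print(df[mask])
```

Note that `.` does not match newlines by default, so for multi-line strings you would likely need to pass `flags=re.DOTALL` to `str.contains`.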

answered Oct 13 '22 04:10

Anzel

import pandas as pd

df = pd.DataFrame({'col': ["apple is delicious",
                           "banana is delicious",
                           "apple and banana both are delicious"]})

targets = ['apple', 'banana']

# Any word from `targets` is present in sentence.
>>> df.col.apply(lambda sentence: any(word in sentence for word in targets))
0    True
1    True
2    True
Name: col, dtype: bool

# All words from `targets` are present in sentence.
>>> df.col.apply(lambda sentence: all(word in sentence for word in targets))
0    False
1    False
2     True
Name: col, dtype: bool
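
One caveat with the `apply` approach: `word in sentence` raises a `TypeError` if a cell is missing (None/NaN), and the check is case-sensitive. A hedged sketch of a more defensive variant follows; the helper name `contains_all` and the sample data are hypothetical, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"col": [
    "Apple and banana both are delicious",
    "banana is delicious",
    None,  # missing value, to show the guard below
]})

targets = ["apple", "banana"]


def contains_all(sentence, words):
    """Return True if every word occurs in sentence as a substring, ignoring case."""
    # Non-string cells (None, NaN) cannot contain any word.
    if not isinstance(sentence, str):
        return False
    lowered = sentence.lower()
    return all(w.lower() in lowered for w in words)


# Extra keyword arguments to Series.apply are forwarded to the function.
mask = df["col"].apply(contains_all, words=targets)
print(df[mask])
```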

answered Oct 13 '22 04:10

Alexander

This also works (regex=True is the default, so passing it explicitly is optional):

df.col.str.contains(r'(?=.*apple)(?=.*banana)',regex=True)

answered Oct 13 '22 03:10

Charan Reddy

Tags: python, string, pandas, dataframe