Python Pandas: String Contains and Doesn't Contain

Tags: python, pandas, dataframe

I'm trying to match rows of a Pandas DataFrame that contain one string and don't contain another. For example:

import pandas
df = pandas.Series(['ab1', 'ab2', 'b2', 'c3'])
df[df.str.contains("b")]

Output:

0    ab1
1    ab2
2     b2
dtype: object

Desired output:

2     b2
dtype: object

Question: is there an elegant way of saying something like this?

df[[df.str.contains("b")==True] and [df.str.contains("a")==False]]
# Doesn't give desired outcome

729

asked Dec 03 '15 00:12

Sam Perry

3 Answers

You're almost there; you just haven't got the syntax quite right. Combine the conditions with the element-wise & operator, wrapping each one in parentheses:

df[(df.str.contains("b") == True) & (df.str.contains("a") == False)]
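To see why the original attempt misbehaves, here is a minimal sketch using the data from the question. The names s, has_a, has_b and lists_and are illustrative, not from the original post. Python's and between two non-empty lists simply returns the second list, so no element-wise comparison happens, whereas & and ~ act on each element of a boolean Series.

```python
import pandas as pd

# Same data as in the question; `s` is an illustrative name.
s = pd.Series(['ab1', 'ab2', 'b2', 'c3'])

has_b = s.str.contains("b")
has_a = s.str.contains("a")

# `and` on two non-empty lists just evaluates to the second list,
# so the "b" condition is silently discarded.
lists_and = [has_b] and [has_a]
print(lists_and[0] is has_a)  # → True

# `&` and `~` work element-wise on boolean Series.
mask = has_b & ~has_a
result = s[mask]
print(result.tolist())  # → ['b2']
```

Since str.contains already returns booleans here, the == True and == False comparisons can be dropped in favour of the bare mask and ~.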

Another approach, which might be cleaner if you have a lot of conditions to apply, would be to chain your filters together with reduce or a loop:

from functools import reduce
filters = [("a", False), ("b", True)]
reduce(lambda df, f: df[df.str.contains(f[0]) == f[1]], filters, df)
#outputs b2
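The loop variant mentioned above could look roughly like this. It is a sketch under a few assumptions: the names s, mask, substring and wanted are made up for illustration, and regex=False is passed so that each pattern is treated as a literal substring rather than a regular expression. Instead of filtering the Series repeatedly, it accumulates a single boolean mask and indexes once at the end.

```python
import pandas as pd

# Same data and filter list as above; variable names are illustrative.
s = pd.Series(['ab1', 'ab2', 'b2', 'c3'])
filters = [("a", False), ("b", True)]

# Start with an all-True mask aligned to the Series index.
mask = pd.Series(True, index=s.index)
for substring, wanted in filters:
    # Keep a row only if its "contains" result equals the wanted value.
    mask = mask & (s.str.contains(substring, regex=False) == wanted)

result = s[mask]
print(result.tolist())  # → ['b2']
```

Building one mask keeps the original index alignment throughout, which can be easier to debug than a chain of successively smaller Series.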

131

answered Oct 05 '22 05:10

maxymoo

You can index with .loc and use ~ to negate the second condition:

df.loc[(df.str.contains("b")) & (~df.str.contains("a"))]

2    b2
dtype: object
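One thing to watch for with this pattern is missing data. Depending on the pandas version and the dtype, str.contains may propagate missing values into the mask, which can make ~ or boolean indexing fail. A hedged sketch, assuming you want missing entries to count as "does not contain", is to pass na=False. The extra None row and the name s are added here purely for illustration.

```python
import pandas as pd

# The question's data plus one missing value (added for illustration).
s = pd.Series(['ab1', 'ab2', 'b2', 'c3', None])

# na=False makes missing entries evaluate to False in both masks.
has_b = s.str.contains("b", na=False)
has_a = s.str.contains("a", na=False)

result = s.loc[has_b & ~has_a]
print(result.tolist())  # → ['b2']
```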

answered Oct 05 '22 03:10

lstodd

Either combine two boolean masks (ts here is the Series from the question):

>>> ts.str.contains('b') & ~ts.str.contains('a')
0    False
1    False
2     True
3    False
dtype: bool

or use a single regex that requires a b and forbids a anywhere in the string:

>>> ts.str.contains('^[^a]*b[^a]*$')
0    False
1    False
2     True
3    False
dtype: bool
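Both expressions return a boolean mask rather than the matching rows, so the mask still has to be used as an index. The sketch below assumes ts holds the data from the question, checks that the two masks agree, and then selects the rows. The names mask_bool, mask_regex and result are illustrative.

```python
import pandas as pd

# Assumed to be the same Series as in the question.
ts = pd.Series(['ab1', 'ab2', 'b2', 'c3'])

# Two masks: one built from two conditions, one from a single regex.
mask_bool = ts.str.contains('b') & ~ts.str.contains('a')
mask_regex = ts.str.contains(r'^[^a]*b[^a]*$')

# The two approaches should select exactly the same rows.
print((mask_bool == mask_regex).all())  # → True

# Index with the mask to get the rows themselves.
result = ts[mask_regex]
print(result.tolist())  # → ['b2']
```

The regex form needs only one pass over the strings, but the two-mask form is usually easier to read and to extend with further conditions.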

answered Oct 05 '22 05:10

behzad.nouri
