How to test if a string contains one of the substrings in a list, in pandas?

One option is just to use the regex | character to try to match each of the substrings in the words in your Series s (still using str.contains).

You can construct the regex by joining the words in searchfor with |:

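>>> import pandas as pd
>>> s = pd.Series(['cat', 'hat', 'dog', 'fog', 'pet'])  # sample data, assumed for illustration
>>> searchfor = ['og', 'at']
>>> s[s.str.contains('|'.join(searchfor))]
0    cat
1    hat
2    dog
3    fog
dtype: object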

As @AndyHayden noted in the comments below, take care if your substrings have special characters such as $ and ^ which you want to match literally. These characters have specific meanings in the context of regular expressions and will affect the matching.

You can make your list of substrings safer by escaping non-alphanumeric characters with re.escape:

>>> import re
>>> matches = ['$money', 'x^y']
>>> safe_matches = [re.escape(m) for m in matches]
>>> safe_matches
['\\$money', 'x\\^y']

The strings within this new list will match each character literally when used with str.contains.
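
To tie the two steps together, here is a minimal sketch that escapes each substring and then joins the results with | before calling str.contains. The sample Series and its values are assumptions made for illustration; they are not from the original answer.

```python
import re
import pandas as pd

# Hypothetical sample data: some values contain regex metacharacters.
s = pd.Series(['$money', 'x^y', 'money', 'xy'])
matches = ['$money', 'x^y']

# Escape each substring so $ and ^ are matched literally,
# then join with | to build a single alternation pattern.
pattern = '|'.join(re.escape(m) for m in matches)
result = s[s.str.contains(pattern)]

# For comparison: the same list joined without escaping.
# Here $ and ^ act as anchors rather than literal characters.
unescaped = s.str.contains('|'.join(matches))

print(result)
print(unescaped)
```

Printing both results side by side makes it easy to see how the anchors change which rows are selected.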

You can use str.contains alone with a regex pattern using OR (|):

s[s.str.contains('og|at')]

Or you could add the series to a dataframe then use str.contains:

df = pd.DataFrame(s)
df[s.str.contains('og|at')]

Output:

0 cat
1 hat
2 dog
3 fog

Here is a one-line lambda that also works:

df["TrueFalse"] = df['col1'].apply(lambda x: 1 if any(i in x for i in searchfor) else 0)

Input:

searchfor = ['og', 'at']

df = pd.DataFrame([('cat', 1000.0), ('hat', 2000000.0), ('dog', 1000.0), ('fog', 330000.0),('pet', 330000.0)], columns=['col1', 'col2'])

  col1       col2
0  cat     1000.0
1  hat  2000000.0
2  dog     1000.0
3  fog   330000.0
4  pet   330000.0

Apply Lambda:

df["TrueFalse"] = df['col1'].apply(lambda x: 1 if any(i in x for i in searchfor) else 0)

Output:

  col1       col2  TrueFalse
0  cat     1000.0          1
1  hat  2000000.0          1
2  dog     1000.0          1
3  fog   330000.0          1
4  pet   330000.0          0
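
One caveat with the lambda: the in test raises a TypeError if the column holds a missing value (NaN), because a float is not iterable. As a possible alternative, the sketch below builds the same 0/1 column with str.contains and na=False. The extra NaN row is an assumption added for illustration and is not part of the original data.

```python
import re
import numpy as np
import pandas as pd

searchfor = ['og', 'at']

# Same col1 values as in the answer, plus one hypothetical missing value.
df = pd.DataFrame({'col1': ['cat', 'hat', 'dog', 'fog', 'pet', np.nan]})

# Escape the substrings so they are matched literally, like the `in` test.
pattern = '|'.join(re.escape(w) for w in searchfor)

# na=False treats missing values as "no match" instead of propagating NaN.
df['TrueFalse'] = df['col1'].str.contains(pattern, na=False).astype(int)

print(df)
```

Because the substrings are escaped, this should agree with the lambda on non-missing rows while avoiding the per-row Python function call.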
