
Match exact phrase within a string in Python

I'm trying to determine whether a substring is in a string. The problem is that I don't want my function to return True if the substring is only found as part of another word in the string.

For example, if the substring is "Purple cow" and the string is "Purple cows make the best pets.", this should return False, since "cow" isn't plural in the substring.

If the substring is "Purple cow" and the string is "Your purple cow trampled my hedge!", it should return True.

My code looks something like this:

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

    return phrase in text


text = "Purple cows make the best pets!"
phrase = "Purple cow"
print(is_phrase_in(phrase, text))  # True, but I want False here

In my actual code I clean up unnecessary punctuation and spaces in 'text' before comparing it to 'phrase', but otherwise this is the same. I've tried using re.search, but I don't understand regular expressions very well yet, and so far I've only gotten the same behavior from them as in my example.

Thanks for any help you can provide!

Asked by Jeremon on Dec 06 '17 at 19:12



2 Answers

Since your phrase can have multiple words, doing a simple split and intersect won't work. I'd go with regex for this one:

import re

def is_phrase_in(phrase, text):
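    # \b anchors the phrase at word boundaries; re.escape keeps any regex
    # metacharacters in the phrase from being interpreted as pattern syntax.
    return re.search(r"\b{}\b".format(re.escape(phrase)), text, re.IGNORECASE) is not None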

phrase = "Purple cow"

print(is_phrase_in(phrase, "Purple cows make the best pets!"))   # False
print(is_phrase_in(phrase, "Your purple cow trampled my hedge!"))  # True
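
One limitation of \b is that it only matches between a word character and a non-word character, so a phrase that starts or ends with punctuation (say "C++") never matches with it. A minimal sketch of a variation, assuming you also want to tolerate extra whitespace between the words of the phrase: it uses lookarounds instead of \b. The name is_phrase_in_loose is made up for this example and is not part of the original answer.

```python
import re

def is_phrase_in_loose(phrase, text):
    """Hypothetical variant: whole-phrase match that tolerates punctuation
    at the phrase edges and variable whitespace between its words."""
    words = phrase.split()
    if not words:
        return False
    # Escape each word so characters like "+" or "." are matched literally,
    # and allow any run of whitespace between consecutive words.
    body = r"\s+".join(re.escape(word) for word in words)
    # Lookarounds instead of \b: the match must not touch a word character
    # on either side, whatever characters the phrase itself starts or ends with.
    pattern = r"(?<!\w)" + body + r"(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None

print(is_phrase_in_loose("Purple cow", "Purple cows make the best pets!"))
print(is_phrase_in_loose("Purple cow", "Your purple   cow trampled my hedge!"))
print(is_phrase_in_loose("C++", "I like C++ a lot"))
```

The first call prints False and the other two print True. Whether loose whitespace matching is desirable depends on how much cleanup is done on the text beforehand.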
Answered by zwer on Sep 30 '22 at 04:09


Using PyParsing:

import pyparsing as pp

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

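    # Keyword only matches when the phrase is not directly adjacent to
    # other identifier characters (letters, digits, underscore).
    rule = pp.ZeroOrMore(pp.Keyword(phrase))
    for t, s, e in rule.scanString(text):
        # ZeroOrMore also yields empty matches, so only count non-empty tokens.
        if t:
            return True
    return False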

text = "Your purple cow trampled my hedge!"
phrase = "Purple cow"
print(is_phrase_in(phrase, text))

Which yields:

True
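
If you'd rather not add a dependency, the same "not glued to a neighbouring word character" check can be sketched with plain string methods. This is only an illustration of the idea, not what PyParsing does internally; is_phrase_in_plain and _is_word_char are hypothetical names invented for this example.

```python
def _is_word_char(ch):
    # Treat letters, digits and underscore as "part of a word".
    return ch.isalnum() or ch == "_"

def is_phrase_in_plain(phrase, text):
    """Hypothetical stdlib-only check: the phrase must occur in the text
    without a word character directly before or after it."""
    phrase = phrase.lower()
    text = text.lower()
    if not phrase:
        return False

    start = text.find(phrase)
    while start != -1:
        end = start + len(phrase)
        # The characters on both sides of the hit must not belong to a word.
        before_ok = start == 0 or not _is_word_char(text[start - 1])
        after_ok = end == len(text) or not _is_word_char(text[end])
        if before_ok and after_ok:
            return True
        # Keep looking: a later occurrence may stand on its own.
        start = text.find(phrase, start + 1)
    return False

print(is_phrase_in_plain("Purple cow", "Purple cows make the best pets!"))
print(is_phrase_in_plain("Purple cow", "Your purple cow trampled my hedge!"))
```

The first call prints False and the second prints True.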
Answered by Raphael on Sep 30 '22 at 05:09