Fuzzy string-matching that can "skip"? e.g. "i am (.*)." has 0 distance to "I am here."

Question

I'm writing a Python chatbot. Whatever the technique (Levenshtein, LCS, regex, etc.), I want a pattern like My name is [ A ]. to be smart enough to match strings like:

My name is Tslmy.              #Distance should = 0, and groupdict()['a'] outputs "Tslmy"
My name is Tesla Tahomana.     #Distance should = 0(!), and groupdict()['a'] outputs "Tesla Tahomana"
my  naem ist tslmy .           #With a little typo, the distance = 5, and groupdict()['a'] outputs "tslmy "

Here I use groupdict()['a'] to refer to whatever the [ A ] placeholder (in regex terms, a named group of the form (?P<identifier>match)) has captured.
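To make the groupdict() convention concrete, here is a minimal sketch using only the standard-library re module. It handles the exact (distance 0) cases and tolerates case and extra whitespace, but it is not fuzzy: a typo in the literal part simply fails to match. The names PATTERN and capture_name are made up for illustration.

```python
import re

# The named group "a" plays the role of the [ A ] slot.
# \s+ tolerates repeated spaces; IGNORECASE tolerates "my" vs "My".
PATTERN = re.compile(r"my\s+name\s+is\s+(?P<a>.+?)\s*\.", re.IGNORECASE)


def capture_name(text):
    """Return what the slot captured, or None if the strict pattern fails."""
    m = PATTERN.fullmatch(text.strip())
    return m.groupdict()["a"] if m else None


if __name__ == "__main__":
    for line in ("My name is Tslmy.",
                 "My name is Tesla Tahomana.",
                 "my  naem ist tslmy ."):
        print(repr(line), "->", capture_name(line))
```

The third sample is rejected outright, which is exactly the gap that a fuzzy matcher with a distance value is meant to fill.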

Put one way, I'm looking for a "Levenshtein" that allows skipped spans (gaps, blanks) and also returns what was skipped.
Put another way, I'm looking for a fuzzy (a.k.a. approximate) regex that is less strict about the pattern, still provides the good old groupdict(), and also reports a "fuzziness" value (an edit distance, needed later to decide which pattern best matches the string).
The fuzzy regex is the preferred solution, since groupdict() gives me everything I need if the patterns are managed well.
However, the TRE library and the regex module, the closest solutions I have found, don't seem to expose a "fuzziness" value. If that can be solved, so much the better!

Is that possible? Thanks for paying attention.

Update:

In the end I decided to use the powerful regex module, but I am still unable to get the "fuzziness" value.

Since the question on this page is theoretically solved, appending much more to it would be bad form. So I have posted another question about this new issue, and I hope you can solve it!

joel.d · Accepted Answer

You could use a regex for the basic match (one or two words as a named group, with the final period escaped):

r"My name is (?P<a>\w+(?: \w+)?)\."

And then use the TRE library to allow for variations.
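If a distance value is needed alongside the captured text, one option that avoids any third-party library is to compute it directly. The following is only a sketch under stated assumptions, not what TRE does internally: the pattern contains exactly one slot written literally as [A], the slot matches any span at zero cost, and the literal text on each side is scored with plain Levenshtein distance (no transpositions, case-sensitive). The names edit_row and match_with_gap are hypothetical.

```python
def edit_row(a, b):
    """Return a list r where r[k] is the Levenshtein distance
    between the whole of `a` and the prefix b[:k]."""
    prev = list(range(len(b) + 1))  # distance from "" to b[:k] is k
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to "" is i
        for k, cb in enumerate(b, start=1):
            cur.append(min(prev[k] + 1,                 # delete ca
                           cur[k - 1] + 1,              # insert cb
                           prev[k - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev


def match_with_gap(pattern, text, slot="[A]"):
    """Return (distance, captured) for a pattern with one zero-cost slot."""
    prefix, _, suffix = pattern.partition(slot)
    n = len(text)
    pre = edit_row(prefix, text)              # pre[i]: prefix vs text[:i]
    suf = edit_row(suffix[::-1], text[::-1])  # suf[k]: suffix vs text[n-k:]
    best = None
    best_i = 0  # cheapest place to end the prefix among positions <= j
    for j in range(n + 1):
        if pre[j] < pre[best_i]:
            best_i = j
        total = pre[best_i] + suf[n - j]
        if best is None or total < best[0]:
            best = (total, best_i, j)
    dist, i, j = best
    return dist, text[i:j]  # the slot swallowed text[i:j] for free


if __name__ == "__main__":
    for line in ("My name is Tslmy.",
                 "My name is Tesla Tahomana.",
                 "My naem is Tslmy."):
        print(repr(line), "->", match_with_gap("My name is [A].", line))
```

Running the same function over several candidate patterns and keeping the one with the smallest distance gives the "best matched pattern" selection the question asks about. Lower-casing both pattern and text first would make the comparison case-insensitive.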

Tags: regex, levenshtein-distance, fuzzy-search

tslmy
