I have been trying to implement tabular method for simplification of boolean expressions in python. For that I need to check whether two given strings differ at only one index for example, the function should return the following for the following examples:
0011
and 0111
- true as the two differ only at position 10-001
and 0-101
- true as differ at only 20-011
and 0-101
- false as differ at 2,3right now I am using the following function:
def match(s1,s2):
l=[False,-1]##returns false when they cant be combined
for i in range(len(s1)):
if s1[:i]==s2[:i] and s1[i]!=s2[i] and s1[i+1:]==s2[i+1:]:
l= [True,i]
break
return l
I want to implement it in a very fast manner (low complexity). Is there a way to do so in python?
I used the Levenshtein distance to match strings that are different by one character added or removed only (and not simply replaced). So:
are considered as a match.
The Levenshtein distance is the number of edits it would take to replace one string by another. The Levenshtein distance of the two above strings is 1, because the only thing to do is to remove one character.
To use it, you can simply install the according Levenshtein python module:
pip install python-Levenshtein
and then use it:
from Levenshtein import distance
def match(s1, s2):
return distance(s1, s2) <= 1
This is a more well-performing solution, coded in Python 3:
def match(s1, s2):
ok = False
for c1, c2 in zip(s1, s2):
if c1 != c2:
if ok:
return False
else:
ok = True
return ok
I do not checked for length difference because you said the two strings are equal, but for a more general approach I would add it.
If you need the position of the different character:
def match(s1, s2):
pos = -1
for i, (c1, c2) in enumerate(zip(s1, s2)):
if c1 != c2:
if pos != -1:
return -1
else:
pos = i
return pos
These are benchmarks performed with timeit, tested with match("0-001", "0-101"). I translated all solutions to py3 and removed length test.
Tests with a longer string:
Martijn Pieters' solution:
timeit.timeit('match("0-0016ub5j2oi06u30tj30g6790v3nug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp", "0-0016ub5j2oi06u30tj30g6790v3gug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp")', setup="""
def match(s1, s2):
combo = zip(s1, s2)
return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)
""")
result: 32.82
My solution:
timeit.timeit('match("0-0016ub5j2oi06u30tj30g6790v3nug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp", "0-0016ub5j2oi06u30tj30g6790v3gug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp")', setup="""
def match(s1, s2):
ok = False
for c1, c2 in zip(s1, s2):
if c1 != c2:
if ok:
return False
else:
ok = True
return ok
""")
Result: 20.21
In Python 2, use future_builtins.zip()
(Python 2 and 3 compatible zip()
-as-iterator) (in Python 3 the built-in will do fine) to combine the two strings character by character, then use any()
and all()
to loop over the resulting iterator:
try:
from future_builtins import zip
except ImportError:
pass
def match(s1, s2):
if len(s1) != len(s2):
return False
combo = zip(s1, s2)
return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)
This works because combo
is an iterator; it yields pairs of characters one by one on demand, and any()
and all()
only take as many pairs as needed to determine their outcome.
any()
stops iterating as soon as it finds two characters that are not equal. all()
will only return True if all the remaining characters are equal.
Together the two conditions then are only True if there is exactly one pair that differs.
Because an iterator approach is used, the above does the absolute minimum amount of work to determine if your strings are a match; the moment a second pair is found that doesn't match iteration stops; there is no point in looking at the rest of the character combinations.
Demo (Python 2, so zip()
is imported):
>>> from future_builtins import zip
>>> def match(s1, s2):
... if len(s1) != len(s2):
... return False
... combo = zip(s1, s2)
... return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)
...
>>> match('0011', '0111')
True
>>> match('0-001', '0-101')
True
>>> match('0-011', '0-101')
False
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With