I faced this question in an interview:
Given two regular expression, compute the edit distance between them. The edit distance being defined as the smallest edit distance between any two strings generated by the two regular expressions respectively.
Formally, we are looking for d(L1,L2) = min { d(x,y) | x from L1, y from L2 }
, where L1
and L2
are the languages generated by the two regular expressions.
I was not able to solve it during interviews. Even now I don't have any clue how to solve it. Any ideas?
I think this is same as http://www.spoj.com/problems/AMR10B/
Delete 'm'th character of str1 and compute edit distance between 'm-1' characters of str1 and 'n' characters of str2. For this computation, we simply have to do - (1 + array[m-1][n]) where 1 is the cost of delete operation and array[m-1][n] is edit distance between 'm-1' characters of str1 and 'n' characters of str2.
Edit Distance or Levenstein distance (the most common) is a metric to calculate the similarity between a pair of sequences. The distance between two sequences is measured as the number of edits (insertion, deletion, or substitution) that are required to convert one sequence to another.
Abstract: The normalized edit distance is one of the distances derived from the edit distance. It is useful in some applications because it takes into account the lengths of the two strings compared. The normalized edit distance is not defined in terms of edit operations but rather in terms of the edit path.
The main results presented here are: The problem of edit distance with moves is NP-complete. A “recursive” sequence of moves can be simulated with at most a constant factor increase by a non-recursive sequence.
There's finite state machines that represent the two languages. Let's say the first language has states S[1], S[2], ..., S[N1] and transitions c: S[i]->S[j] (meaning state i goes to state j under input character c), and T[1], T[2], ... T[N2] for the second language (with its own set of transitions).
Now, you can construct the weighted multi-graph with nodes being pairs of states, and edges between pairs (S[i1], T[i2]) -> (S[j1], T[j2]) if any of these four cases hold:
Then, finding the lowest weight path from the pair of start states to any pair of accepting states gives you the minimal edit distance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With