Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to modify Levenshtein algorithm, to know if it inserted, deleted, or substituted a character?

So I am trying to devise a spin off of the Levenshtein algorithm, where I keep track of what transformations I did in the string(inserted a, or substitute a for b).

Example:

Basically, say I am computing the edit distance of "bbd" and "bcd"

The edit distance will be 1 and the transformation will be "substitude b for c"

Question: How would I approach this problem since the implementations i've seen do not concern themselves with knowing what kind of operation it was but only the total cost?

like image 889
Georgi Angelov Avatar asked Jan 11 '23 10:01

Georgi Angelov


1 Answers

You can use this module - there is an editops function there, which returns a list with the operations needed to transform one string to another.

Example:

Levenshtein.editops("FBBDE", "BCDASD")
[('delete', 0, 0), ('replace', 2, 1), ('insert', 4, 3), ('insert', 4, 4), ('replace', 4, 5)]

From the docs:

Find sequence of edit operations transforming one string to another.

editops(source_string, destination_string) editops(edit_operations, source_length, destination_length)

The result is a list of triples (operation, spos, dpos), where operation is one of 'equal', 'replace', 'insert', or 'delete'; spos and dpos are position of characters in the first (source) and the second (destination) strings. These are operations on single characters. In fact the returned list doesn't contain the 'equal', but all the related functions accept both lists with and without 'equal's.

like image 163
Ivan Genchev Avatar answered Jan 17 '23 07:01

Ivan Genchev