
How to add levenshtein function in mysql?

I got the code for a Levenshtein distance function for MySQL from http://kristiannissen.wordpress.com/2010/07/08/mysql-levenshtein/ (archive.org link), but how do I add that function to MySQL? I am using XAMPP and need it for a search feature in PHP.

Sandesh Sharma asked Dec 17 '12 07:12



2 Answers

I connected to my MySQL server, executed this statement in MySQL Workbench, and it simply worked: I now have a new function, levenshtein().

For example, this works as expected:

SELECT levenshtein('abcde', 'abced');
-- returns 2
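
Since the question mentions search, a query along the following lines might be a starting point once the function exists. This is only a sketch: the `products` table, its `name` column, the search term, and the distance threshold of 2 are all hypothetical and do not come from the question.

```sql
-- Hypothetical fuzzy search: `products` and `name` are assumed names.
-- Returns rows whose name is within 2 single-character edits of the term.
SELECT name,
       levenshtein(name, 'levenstein') AS distance
FROM products
WHERE levenshtein(name, 'levenstein') <= 2
ORDER BY distance
LIMIT 10;
```

From PHP, the search term would normally be bound as a prepared-statement parameter rather than concatenated into the SQL string. Note that the function is evaluated for every row and cannot use an ordinary index, so a query like this may be slow on large tables.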
mvp answered Sep 22 '22 21:09


DELIMITER $$
CREATE FUNCTION levenshtein( s1 VARCHAR(255), s2 VARCHAR(255) )
  RETURNS INT
  DETERMINISTIC
  BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char CHAR;
    -- max strlen=255
    DECLARE cv0, cv1 VARBINARY(256);
    SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2), cv1 = 0x00, j = 1, i = 1, c = 0;
    IF s1 = s2 THEN
      RETURN 0;
    ELSEIF s1_len = 0 THEN
      RETURN s2_len;
    ELSEIF s2_len = 0 THEN
      RETURN s1_len;
    ELSE
      WHILE j <= s2_len DO
        SET cv1 = CONCAT(cv1, UNHEX(HEX(j))), j = j + 1;
      END WHILE;
      WHILE i <= s1_len DO
        SET s1_char = SUBSTRING(s1, i, 1), c = i, cv0 = UNHEX(HEX(i)), j = 1;
        WHILE j <= s2_len DO
          SET c = c + 1;
          IF s1_char = SUBSTRING(s2, j, 1) THEN
            SET cost = 0;
          ELSE
            SET cost = 1;
          END IF;
          SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
          IF c > c_temp THEN
            SET c = c_temp;
          END IF;
          SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10) + 1;
          IF c > c_temp THEN
            SET c = c_temp;
          END IF;
          SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;
        END WHILE;
        SET cv1 = cv0, i = i + 1;
      END WHILE;
    END IF;
    RETURN c;
  END$$
DELIMITER ;
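
After running the definition, a few quick checks can help confirm that the function was created and behaves as expected. The test strings below are illustrative and are not part of the original answer.

```sql
-- List the stored function if it exists; the Db column shows which
-- database it was created in (stored functions belong to a database).
SHOW FUNCTION STATUS WHERE Name = 'levenshtein';

-- Sanity checks against well-known Levenshtein distances.
SELECT levenshtein('kitten', 'sitting');  -- 3
SELECT levenshtein('', 'abc');            -- 3 (three insertions)
SELECT levenshtein('abc', 'abc');         -- 0
```

DELIMITER is a command of the mysql client rather than a server-side SQL statement, so if the tool you use rejects it, look for a separate delimiter setting in that tool and drop the two DELIMITER lines.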
KEYAN TECH answered Sep 22 '22 21:09