
Fast Levenshtein distance in R?

Is there a package with a Levenshtein distance function that is implemented in C or Fortran? I have many strings to compare, and stringMatch from MiscPsycho is too slow for this.

mbq asked Jul 05 '10 20:07


People also ask

How does R calculate Levenshtein distance?

To calculate the Levenshtein distance between two character vectors in R, use the stringdist() function from the stringdist package with method = "lv". The function takes two character vectors as arguments and returns a numeric vector containing the distance between each pair of strings. Note that the default method is "osa" (optimal string alignment), not plain Levenshtein, so the method has to be set explicitly.
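A minimal sketch of that call, assuming the stringdist package is installed from CRAN. The example words are made up for illustration:

```r
# Assumes: install.packages("stringdist") has been run beforehand
library(stringdist)

a <- c("kitten", "flaw", "saturday")
b <- c("sitting", "lawn", "sunday")

# method = "lv" requests plain Levenshtein distance; the default is "osa"
d <- stringdist(a, b, method = "lv")
print(d)
```

The two vectors are compared element by element, so d has one entry per pair.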

Is Levenshtein distance NLP?

Levenshtein distance is a general string metric rather than an NLP technique in itself, but NLP pipelines often use it. For example, a vector search finds the most similar entry as defined by the vectorization, and checking each named entity in that entry with Levenshtein distance can then improve accuracy.

Is edit distance same as Levenshtein?

Different definitions of an edit distance use different sets of string operations. Levenshtein distance operations are the removal, insertion, or substitution of a character in the string. Being the most common metric, the term Levenshtein distance is often used interchangeably with edit distance.
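The three operations can be seen on a small example. This sketch uses adist() from base R (the utils package), which is not mentioned in the original thread but ships with current R versions:

```r
# adist() is part of base R (utils), so no extra install is needed
d <- adist("kitten", "sitting", counts = TRUE)

# kitten -> sitten (substitute k with s), sitten -> sittin (substitute e with i),
# sittin -> sitting (insert g)
print(d[1, 1])

# Breakdown of the edit script into insertions, deletions and substitutions
print(attr(d, "counts")[1, 1, ])
```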

What is hamming and Levenshtein distance?

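The Hamming distance is the number of positions at which the corresponding symbols in two strings differ, so it is only defined for strings of equal length. The Levenshtein distance also allows insertions and deletions, so it works for strings of different lengths, and for equal-length strings it is never larger than the Hamming distance. Like any metric, it satisfies the triangle inequality: the Levenshtein distance between two strings is no greater than the sum of their Levenshtein distances from a third string.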
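A small sketch of the difference, using only base R. The hamming() helper is hypothetical and written here for illustration; adist() comes from the utils package:

```r
# Hypothetical helper: Hamming distance for two equal-length strings
hamming <- function(x, y) {
  stopifnot(nchar(x) == nchar(y))
  sum(strsplit(x, "")[[1]] != strsplit(y, "")[[1]])
}

x <- "abcde"
y <- "bcdea"

h <- hamming(x, y)       # compares position by position
l <- adist(x, y)[1, 1]   # may delete the leading "a" and insert it at the end

print(c(hamming = h, levenshtein = l))

# Triangle inequality check on three strings
ab <- adist("kitten", "sitten")[1, 1]
bc <- adist("sitten", "sitting")[1, 1]
ac <- adist("kitten", "sitting")[1, 1]
print(ac <= ab + bc)
```

A shifted string is the typical case where the two measures diverge: every position mismatches, yet two edits are enough.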


3 Answers

stringdist from the stringdist package does this too, and under certain conditions it is even faster than levenshteinDist (1)
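For the "many strings" case in the question, something like the following might be a starting point. It is a sketch that assumes stringdist is installed, and the word list is made up:

```r
# Assumes: install.packages("stringdist") has been run beforehand
library(stringdist)

words <- c("kitten", "sitting", "mitten", "fitting")

# All pairwise Levenshtein distances in one call
m <- stringdistmatrix(words, words, method = "lv")
dimnames(m) <- list(words, words)
print(m)
```

Whether this beats levenshteinDist depends on the data, so it is worth timing both on a sample of your own strings.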

Ben answered Oct 19 '22 21:10


levenshteinDist (from the RecordLinkage package) calls compiled C code. Give it a try.
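A minimal usage sketch, assuming RecordLinkage is installed from CRAN. The example strings are made up:

```r
# Assumes: install.packages("RecordLinkage") has been run beforehand
library(RecordLinkage)

a <- c("kitten", "flaw")
b <- c("sitting", "lawn")

# Vectorised: compares a[1] with b[1], a[2] with b[2], and so on
d <- levenshteinDist(a, b)
print(d)
```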

George Dontas answered Oct 19 '22 23:10


You could also try stringDist from the Biostrings package (Bioconductor).
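A sketch of how that might look, assuming Biostrings has been installed through BiocManager. The word list is made up:

```r
# Assumes: BiocManager::install("Biostrings") has been run beforehand
library(Biostrings)

words <- c("kitten", "sitting", "mitten")

# Returns a "dist" object holding all pairwise Levenshtein distances
d <- stringDist(words, method = "levenshtein")
print(as.matrix(d))
```

Because the result is a dist object, it can be passed straight to functions such as hclust().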

Aaron Statham answered Oct 19 '22 22:10