
Algorithm / Library for measuring degree of equality of strings

Is there an algorithm that, given two strings, yields the degree of equality between them, applying metrics that can be provided externally? For example, the two strings "Plant code" and "PlantCode" could be 0.8 equal, "Plant code" and "Plant" could be 0.6 equal, and "Truck no" and "shipment details" could be 0.6 equal (using an externally provided synonym dictionary). The numbers are made up, but I hope they get the point across. Does such an algorithm exist? I'd prefer a library over implementing it on my own. Any help would be greatly appreciated. Thanks.

missingfaktor asked Dec 27 '22 13:12



2 Answers

Try the SimMetrics library. It provides a large number of similarity metrics.
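To give a feel for what such a metric computes, here is a minimal Python sketch of one common choice, a normalized Levenshtein (edit distance) similarity. It is not SimMetrics code and does not use its API. The function names `levenshtein` and `similarity`, the lowercasing step, and the whole-string `synonyms` lookup are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str, synonyms=None) -> float:
    """Score in [0, 1]; 1.0 means equal after normalization."""
    synonyms = synonyms or {}
    # Assumed normalization: lowercase, then map whole strings through
    # an externally supplied synonym table (hypothetical design).
    a, b = a.lower(), b.lower()
    a, b = synonyms.get(a, a), synonyms.get(b, b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

With this sketch, "Plant code" and "PlantCode" score 0.9 and "Plant code" and "Plant" score 0.5. A synonym entry such as `{"truck no": "shipment details"}` makes that pair score 1.0, because the lookup treats synonyms as identical. A weighted scheme would be needed to get a partial score like the 0.6 in the question.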

Dirk answered Dec 30 '22 11:12



Maybe the google-diff-match-patch library can help. It implements Myers' diff algorithm, which is generally considered to be the best general-purpose diff.
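A diff can be turned into a similarity score by measuring how much of the two strings falls into matching blocks. The sketch below shows the idea with Python's standard-library `difflib.SequenceMatcher`, used here as a stand-in. It is not google-diff-match-patch and does not use Myers' algorithm. The function name `diff_similarity` is made up for this example.

```python
from difflib import SequenceMatcher


def diff_similarity(a: str, b: str) -> float:
    """Share of characters that lie in matching blocks, in [0, 1]."""
    # ratio() returns 2 * M / T, where M is the number of matched
    # characters and T is the combined length of both strings.
    return SequenceMatcher(None, a, b).ratio()
```

Identical strings score 1.0 and strings with no characters in common score 0.0. Unlike a synonym-aware metric, this approach only sees characters, so "Truck no" and "shipment details" would score low.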

Frank Grimm answered Dec 30 '22 11:12
