I don’t believe the standard library provides anything to compute the distance between two strings and I can’t seem to find anything in Boost StringAlgo. So, is there any other library I could use?
I am not too picky about the algorithm. Jaro-Winkler is fine, Levenshtein too, and I am open to suggestions, I don’t want to code something that someone has already coded..
You don't define your question with an actual distance metric, so I presume it just has to satisfy the conditions in "Metric (mathematics)":
A metric on a set X is a function (called the distance function or simply distance) d : X × X → R (where R is the set of real numbers). For all x, y, z in X, this function is required to satisfy the following conditions:
- d(x, y) ≥ 0 (non-negativity, or separation axiom)
- d(x, y) = 0 if and only if x = y (identity of indiscernibles, or coincidence axiom)
- d(x, y) = d(y, x) (symmetry)
- d(x, z) ≤ d(x, y) + d(y, z) (subadditivity / triangle inequality).
Suppose we define d
as such:
{ 0 if x = y
d(x, y) = {
{ 1 otherwise
So the first three conditions are satisfied:
d(x, y) ≥ 0
d(x, y) = 0 iff x = y
d(x, y) = d(y, x) = 0 for x = y
, and d(x, y) = d(y, x) = 1 for x ≠ y
For the last condition, there are two cases:
d(x, z) = 0
. The only conceivable values for the right-hand side are 0
, 1
, and 2
, any of which would satisfy the condition.d(x, z) = 1
. Suppose that the right-hand side isn't greater than or equal to one. This means that it must be zero. Then both terms on the right hand side would each have to be 0
, which means that x = y
and y = z
. The second condition means that x = z
, which in turn means that d(x, z) = 0
. This is a contradiction, so the right-hand side must be greater than or equal to one.Then we can define the metric as:
int d(std::string x, std::string y) {
if (x == y) {
return 0;
} else {
return 1;
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With