R: agrep results quantifier

Question

Is there a built-in way to quantify the results of the agrep function? E.g. in

agrep("test", c("tesr", "teqr", "toar"), max.distance = 2, value = TRUE)
[1] "tesr" "teqr"

tesr is only 1 character edit away from test, teqr is 2 edits away, and toar is 3 edits away and hence not found. So tesr is a "more probable" match than teqr. How can this be retrieved, either as a number of edits or as a percentage? Thanks!

Edit: Apologies for not putting this in the question in the first place. I am already running a two-step procedure: agrep to get my list, and then adist to get the number of edits. adist is slower, and running time is a big factor for my dataset.

Steven Beaupré · Accepted Answer

Another option using adist():

s <- c("tesr", "teqr", "toar")
s[adist("test", s) < 3]

Or using stringdist:

library(stringdist)
s[stringdist("test", s, method = "lv") < 3]

Which gives:

#[1] "tesr" "teqr"
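
Since the question asks for the distance itself (or a percentage) rather than just the filtered strings, here is a minimal base R sketch that keeps the values returned by adist(). The similarity formula (1 minus the distance divided by the longer string's length) is only one possible normalisation and is an assumption of this sketch, as are the names `pattern`, `d`, `sim` and `res`.

```r
pattern <- "test"
s <- c("tesr", "teqr", "toar")

# adist() returns a 1 x length(s) matrix; drop() turns it into a plain vector
d <- drop(adist(pattern, s))

# Assumed normalisation: divide by the length of the longer string,
# so identical strings score 1 and completely different ones score 0
sim <- 1 - d / pmax(nchar(pattern), nchar(s))

res <- data.frame(candidate = s, distance = d, similarity = sim)

# Keep only candidates within 2 edits, as in the question
res[res$distance <= 2, ]
```

The same idea should carry over to stringdist() by swapping the line that computes `d`.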

Benchmark

x <- rep(s, 10e5)
library(RecordLinkage)   # provides levenshteinDist()
library(microbenchmark)
mbm <- microbenchmark(
  levenshteinDist = x[which(levenshteinDist("test", x) < 3)],
  adist = x[adist("test", x) < 3],
  stringdist = x[stringdist("test", x, method = "lv") < 3],
  times = 10
)

Which gives:

Unit: milliseconds
            expr       min        lq      mean    median        uq       max neval cld
 levenshteinDist  840.7897 1255.1183 1406.8887 1398.4502 1510.5398 1960.4730    10  b 
           adist 2760.7677 2905.5958 2993.9021 2986.1997 3038.7692 3472.7767    10   c
      stringdist  145.8252  155.3228  210.4206  174.5924  294.8686  355.1552    10 a
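
If running time is the main concern and the data contains many repeated strings (as the benchmark vector above does), one possible shortcut is to compute the distance once per unique value and map the result back. This is a sketch under that assumption, shown with base R adist(); it will not help when nearly all strings are distinct. The names `u`, `d_u` and `hits` are illustrative.

```r
s <- c("tesr", "teqr", "toar")
x <- rep(s, 1e4)            # heavily duplicated input

u   <- unique(x)            # distinct strings only
d_u <- drop(adist("test", u))   # one distance per distinct string

# Map the per-unique distances back onto the full vector
d <- d_u[match(x, u)]

hits <- x[d < 3]            # same filter as in the answer
```

Here the expensive distance call runs on 3 strings instead of 30,000, and match() does the rest.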

vpipkt · Answer

The Levenshtein distance is the number of edits needed to turn one string into another. The package 'RecordLinkage' may be of interest. It provides the edit distance computation below, which should perform on par with agrep, although it will not return the same results as agrep.

library(RecordLinkage)
ld <- levenshteinDist("test", c("tesr", "teqr", "toar"))
c("tesr", "teqr", "toar")[which(ld < 3)]
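
One reason the results can differ from agrep: by default agrep looks for an approximate match of the pattern anywhere inside each string, whereas a Levenshtein distance compares whole strings. The sketch below illustrates this with base R adist(), whose `partial` argument switches between the two behaviours; the example vector `y` is made up for illustration.

```r
y <- c("tesr", "a test string")

# agrep matches the pattern against substrings of each element
agrep("test", y, max.distance = 2, value = TRUE)

# Whole-string edit distance: the long string is far from "test"
d_full <- drop(adist("test", y))

# Substring (agrep-style) distance: "test" occurs exactly in the second string
d_partial <- drop(adist("test", y, partial = TRUE))

d_full
d_partial
```

So a whole-string threshold such as `ld < 3` would drop "a test string", while an agrep-style search would keep it.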

Tags: r, agrep

Alexey Ferapontov
