In R, I have two character vectors, a and b.
a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
I want a function that counts the character mismatches between each element of a and the corresponding element of b. Using the example above, such a function should return c(2,3,1). There is no need to align the strings.
I need to compare each pair of strings character-by-character and count matches and/or mismatches in each pair. Does any such function exist in R?
Or, to ask the question in another way, is there a function to give me the edit distance between two strings, where the only allowed operation is substitution (ignore insertions or deletions)?
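Split each string into single characters with strsplit, then use mapply to compare each pair position by position and sum the mismatches: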
mapply(function(x,y) sum(x!=y),strsplit(a,""),strsplit(b,""))
#[1] 2 3 1
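If this is needed more than once, the one-liner above can be wrapped in a small helper. This is only a sketch: the name count_mismatches is made up for illustration, and the length check is an added assumption based on the question's statement that the strings need no alignment (without it, x != y would silently recycle the shorter vector).

```r
# Hypothetical helper: counts position-by-position mismatches for each pair.
count_mismatches <- function(a, b) {
  # Assumption: paired strings have the same number of characters.
  stopifnot(length(a) == length(b), all(nchar(a) == nchar(b)))
  mapply(function(x, y) sum(x != y), strsplit(a, ""), strsplit(b, ""))
}

a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
count_mismatches(a, b)
#[1] 2 3 1
```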
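Another option is to use adist, which computes the approximate string distance between character vectors. Note that adist returns a generalized Levenshtein distance, so it also allows insertions and deletions; it matches the substitution-only count for the example here, but not for every pair of strings: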
mapply(adist,a,b)
#   abcdefg  hijklmnop qrstuvwxyz
#         2          3          1
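To see where the two approaches can disagree, here is a small illustration with a made-up pair of strings (x and y are not from the question). One string is the other shifted by one character, so every position differs, yet a single deletion plus a single insertion turns one into the other.

```r
x <- "abcdef"
y <- "bcdefg"

# Position-by-position comparison: all six positions differ.
hamming <- sum(strsplit(x, "")[[1]] != strsplit(y, "")[[1]])
hamming
#[1] 6

# adist() may delete "a" and append "g" instead, so it reports 2.
lev <- as.vector(adist(x, y))
lev
#[1] 2
```

So if insertions and deletions really must be ignored, the strsplit approach is the safer choice.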