Count Pattern Matching in R

Question

How can I efficiently count the number of times one character string occurs within another character string?

Below is my code to date. It successfully identifies whether any instance of one string occurs in the other. However, I do not know how to extend it from a TRUE/FALSE result to a count.

x <- ("Hello my name is Christopher. Some people call me Chris")
y <- ("Chris is an interesting person to be around")
z <- ("Because he plays sports and likes statistics")

lll <- tolower(list(x,y,z))
dict <- tolower(c("Chris", "Hell"))

mmm <- matrix(nrow=length(lll), ncol=length(dict), NA)

for (i in 1:length(lll)) {
  for (j in 1:length(dict)) {
    mmm[i,j] <- sum(grepl(dict[j],lll[i]))
  }
}
mmm

It yields:

       [,1] [,2]
 [1,]    1    1
 [2,]    1    0
 [3,]    0    0

Since the lower-case string "chris" appears twice in lll[1], I would like mmm[1,1] to be 2 instead of 1.

My real example has much higher dimensions, so I would love it if the code could be vectorized instead of relying on my brute-force for loops.

Ricardo Saporta · Accepted Answer

Two quick tips:

Avoid the dual for-loop; you don't need it ;)
Use the stringr package.

library(stringr)

dict <- setNames(nm=dict)  # name each pattern after itself, simply for neatness
lapply(dict, str_count, string=lll)
# $chris
# [1] 2 1 0
#
# $hell
# [1] 1 0 0

Or as a matrix:

sapply(dict, str_count, string=lll)
#      chris hell
# [1,]     2    1
# [2,]     1    0
# [3,]     0    0
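
One caveat worth illustrating: str_count() interprets each pattern as a regular expression, so a dictionary entry containing a metacharacter such as "." would match more than intended. Below is a minimal sketch, assuming the dictionary terms should be matched literally; dict2 and its "dot" entry are hypothetical and only serve to show the difference, while fixed() is stringr's modifier for literal matching.

```r
library(stringr)

# Same sample sentences as in the question, lower-cased
lll <- tolower(c("Hello my name is Christopher. Some people call me Chris",
                 "Chris is an interesting person to be around",
                 "Because he plays sports and likes statistics"))

# Hypothetical dictionary: "." is a regex metacharacter
dict2 <- c(chris = "chris", dot = ".")

# fixed() makes str_count() compare the pattern literally,
# so "." counts full stops rather than every character
counts <- sapply(dict2, function(p) str_count(lll, fixed(p)))
counts
```

Without fixed(), the "dot" column would count every character in each sentence.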

Matthew Plourde · Answer

You can also do something like this:

# For each element of vec, extract every match of pat and count them
count.matches <- function(pat, vec) sapply(regmatches(vec, gregexpr(pat, vec)), length)
mapply(count.matches, c('chris', 'hell'), list(lll))
#      chris hell
# [1,]     2    1
# [2,]     1    0
# [3,]     0    0
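
The same base-R idea can be written with lengths() and vapply(), which may scale a little better on larger inputs. This is only a sketch under a few assumptions: count_fixed is a hypothetical helper name, the patterns are treated as literal strings (fixed = TRUE), and lengths() requires R 3.2.0 or later.

```r
# Sample sentences from the question, lower-cased
lll <- tolower(c("Hello my name is Christopher. Some people call me Chris",
                 "Chris is an interesting person to be around",
                 "Because he plays sports and likes statistics"))
dict <- c("chris", "hell")

# Count literal (non-regex), non-overlapping occurrences of pat in each string.
# gregexpr() returns -1 when nothing matches; regmatches() turns that into
# character(0), so lengths() reports 0 for those strings.
count_fixed <- function(pat, vec) {
  lengths(regmatches(vec, gregexpr(pat, vec, fixed = TRUE)))
}

# vapply() checks that every pattern yields one integer count per string
mmm <- vapply(dict, count_fixed, integer(length(lll)), vec = lll)
mmm
```

The result is a matrix with one row per string and one column per dictionary term, matching the layout of mmm in the question.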

Tags: regex, pattern-matching, r

Chris
