I have this string vector (for example):
str <- c("this is a string current trey",
"feather rtttt",
"tusla",
"laq")
To count the number of words in each element of this vector, I used the following (taken from "Count the number of words in a string in R?", which is a possible duplicate, although my issue is different):
No_words <- sapply(gregexpr("\\W+", str), length) + 1
but it returns
6 2 2 2
The last two elements ("tusla" and "laq") each contain only one word,
so it should return
6 2 1 1
How do I get around this problem?
Use the stringi package and stri_count:
require(stringi)
str <- c(
"this is a string current trey",
"nospaces",
"multiple spaces",
" leadingspaces",
"trailingspaces ",
" leading and trailing ",
"just one space each")
> stri_count(str,regex="\\S+")
[1] 6 1 2 1 1 3 4
Use the wc() function from the qdap package.
str <- c("this is a string current trey",
"feather rtttt",
"tusla",
"laq")
library("qdap")
wc(str)
That returns:
[1] 6 2 1 1
You can try
sapply(gregexpr("\\S+", str), length)
## [1] 6 2 1 1
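As far as I can tell, the original expression over-counts because gregexpr() returns a single -1 when the pattern does not match at all, so length() is 1 rather than 0, and adding 1 gives 2 for a one-word string. Matching the words themselves ("\\S+") avoids the "+ 1", but an empty string would still be counted as 1 for the same reason. A minimal sketch of both points follows; count_words is a hypothetical helper name, not something from base R or from the question:

```r
str <- c("this is a string current trey", "feather rtttt", "tusla", "laq")

# With no separator to find, gregexpr() reports a single -1, not an empty vector.
no_match <- gregexpr("\\W+", "tusla")[[1]]
as.integer(no_match)     # -1
length(no_match) + 1     # 2, the over-count seen in the question

# Hypothetical helper: count only real matches (positions > 0),
# so an empty string gives 0 instead of 1.
count_words <- function(v) {
  vapply(gregexpr("\\S+", v), function(m) sum(m > 0), integer(1))
}

counts <- count_words(c(str, ""))
counts                   # 6 2 1 1 0
```

If the vector never contains empty strings, the plain sapply() one-liner above is enough.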
Or as suggested in comments you can try
sapply(strsplit(str, "\\s+"), length)
## [1] 6 2 1 1
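One caveat with the strsplit() variant: if an element starts with whitespace, the split produces an empty first piece, and that piece gets counted as a word. Trailing whitespace does not have this effect. A small sketch, assuming trimming with trimws() (base R 3.2.0 or later) is acceptable for your data; the sample vector x is made up for illustration:

```r
# Made-up sample with leading, repeated, and trailing whitespace.
x <- c("tusla", " leadingspaces", "two  words ")

# A leading separator yields an empty first piece, which inflates the count.
raw_counts <- lengths(strsplit(x, "\\s+"))
raw_counts       # 1 2 2

# Trimming before splitting removes the spurious empty piece.
trimmed_counts <- lengths(strsplit(trimws(x), "\\s+"))
trimmed_counts   # 1 1 2
```

The gregexpr("\\S+", ...) version does not need the trimming step, since it matches the words rather than the gaps between them.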