
How to find the length of a string in R

How to find the length of a string (i.e., number of characters in a string) without splitting it in R? I know how to find the length of a list but not of a string.

And what about Unicode strings? How do I find the length (in bytes) and the number of characters (runes, symbols) in a Unicode string?

Related Question:

  • How to find the "real" number of characters in a Unicode string in R
Igor Chubin asked Jun 21 '12 09:06



2 Answers

See ?nchar. For example:

    > nchar("foo")
    [1] 3
    > set.seed(10)
    > strn <- paste(sample(LETTERS, 10), collapse = "")
    > strn
    [1] "NHKPBEFTLY"
    > nchar(strn)
    [1] 10
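For the Unicode part of the question, nchar() also takes a type argument that switches between counting characters and counting bytes. A minimal sketch in base R, assuming the string is stored as UTF-8 (the \u escape ensures that here); the variable names are only illustrative:

```r
# "naive" with a diaeresis on the i, written with a \u escape so it is stored as UTF-8
s <- "na\u00efve"

n_chars <- nchar(s, type = "chars")  # number of characters; "chars" is the default type
n_bytes <- nchar(s, type = "bytes")  # number of bytes in the stored encoding

n_chars  # 5
n_bytes  # 6, because the accented letter takes two bytes in UTF-8
```

The byte count depends on the encoding the string is stored in, so the same text converted to Latin-1 would report 5 bytes.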
Gavin Simpson answered Oct 13 '22 18:10


Use the stri_length function from the stringi package:

    > library(stringi)
    > stri_length(c("ala ma kota","ABC",NA))
    [1] 11  3 NA

Why? Because it is the FASTEST among presented solutions :)

    require(microbenchmark)
    require(stringi)
    require(stringr)
    x <- c(letters,NA,paste(sample(letters,2000,TRUE),collapse=" "))
    microbenchmark(nchar(x),str_length(x),stri_length(x))
    Unit: microseconds
               expr    min     lq  median      uq     max neval
           nchar(x) 11.868 12.776 13.1590 13.6475  41.815   100
      str_length(x) 30.715 33.159 33.6825 34.1360 173.400   100
     stri_length(x)  2.653  3.281  4.0495  4.5380  19.966   100

It also works fine with NAs:

    nchar(NA)
    ## [1] 2
    stri_length(NA)
    ## [1] NA

EDIT 2021

The NA argument above is no longer valid if you are using a recent R version: nchar() now returns NA for a missing string (NA_character_) by default.
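How base R treats missing strings can be checked directly. A small sketch, assuming a recent R version in which nchar() has the keepNA argument; the variable names are only illustrative:

```r
x <- c("ala ma kota", "ABC", NA)  # NA inside a character vector is a missing string

n_default <- nchar(x)                  # a missing string stays missing
n_literal <- nchar(x, keepNA = FALSE)  # a missing string is counted as the two characters "NA"

n_default  # 11  3 NA
n_literal  # 11  3  2
```

With the default settings the result matches the stri_length output shown earlier, so speed is the main remaining difference.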

bartektartanus answered Oct 13 '22 17:10