
R: convert a string to a vector by tokenizing on " "


I have a string:

string1 <- "This is my string" 

I would like to convert it to a vector that looks like this:

vector1 "This" "is" "my" "string" 

How do I do this? I know I could use the tm package to build a TermDocumentMatrix and then convert it to a matrix, but that would alphabetize the words, and I need them to stay in their original order.

screechOwl asked Aug 13 '12 01:08



2 Answers

You can use strsplit to accomplish this task.

string1 <- "This is my string"
strsplit(string1, " ")[[1]]
#[1] "This"   "is"     "my"     "string"
Dason answered Sep 30 '22 05:09
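As a side note on the `[[1]]` in the answer above: strsplit works on whole character vectors, so it returns a list with one element per input string, and `[[1]]` pulls out the tokens for the first (here, only) string. The sketch below illustrates this; the names `parts`, `vector1` and `words` are only illustrative, and `fixed = TRUE` is an optional extra that is not part of the original answer.

```r
string1 <- "This is my string"

# strsplit() returns a list: one character vector per input string.
parts <- strsplit(string1, " ")
is.list(parts)   # TRUE
length(parts)    # 1, because string1 holds a single string

# [[1]] extracts the tokens for the first input string, in their original order.
vector1 <- parts[[1]]
print(vector1)

# fixed = TRUE treats the separator as a literal string rather than a regex
# (an optional choice; for a single space the result is the same).
vector1_fixed <- strsplit(string1, " ", fixed = TRUE)[[1]]

# With several input strings, unlist() flattens all tokens into one vector.
words <- unlist(strsplit(c("This is", "my string"), " "))
print(words)
```

If only one string is ever passed in, `[[1]]` and `unlist()` give the same result.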


Slightly different from Dason's answer: this splits on any run of whitespace, including newlines:

string1 <- "This   is my string"
strsplit(string1, "\\s+")[[1]]
Sacha Epskamp answered Sep 30 '22 06:09
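To make the difference between the two answers concrete, here is a small comparison of the `" "` and `"\\s+"` separators on input with repeated spaces and a newline. The test strings `string2` and `padded` are made up for illustration, and the `trimws()` step (available in base R since 3.2.0) is an added suggestion for leading whitespace, not something either answer proposed.

```r
# Hypothetical input with three spaces and a newline between words.
string2 <- "This   is my\nstring"

# Splitting on a single space leaves empty strings between repeated spaces
# and does not split at the newline.
by_space <- strsplit(string2, " ")[[1]]
print(by_space)

# Splitting on "\\s+" treats any run of whitespace as one separator.
by_ws <- strsplit(string2, "\\s+")[[1]]
print(by_ws)

# Caveat: a match at the very start of the string yields a leading "".
padded <- "  padded string "
with_empty <- strsplit(padded, "\\s+")[[1]]
print(with_empty)

# Trimming first avoids the empty first element.
trimmed <- strsplit(trimws(padded), "\\s+")[[1]]
print(trimmed)
```

Trailing whitespace does not produce an extra empty element, so in practice only leading whitespace needs the trim.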