Splitting a string on the first space

Tags:

regex

r

I'd like to split a vector of character strings (people's names) into two columns (vectors): first names and last names. The problem is that some people have a two-word last name, which must stay together. I can split out the first names using the code below, but the last name eludes me. (Look at obs 29 in the sample set below to get an idea: the Ford has a "last name" of Pantera L that must be kept together.)

What I have attempted so far:

    x <- rownames(mtcars)
    unlist(strsplit(x, " .*"))

What I'd like it to look like:

                MANUF       MAKE
    27          Porsche     914-2
    28          Lotus       Europa
    29          Ford        Pantera L
    30          Ferrari     Dino
    31          Maserati    Bora
    32          Volvo       142E
Tyler Rinker asked Nov 28 '11 17:11



1 Answer

The regular expression rexp matches the first word at the start of the string, then an optional whitespace character, then the rest of the string. The parenthesized subexpressions are capture groups, referenced in the replacement as the backreferences \\1 and \\2.

    rexp <- "^(\\w+)\\s?(.*)$"
    y <- data.frame(MANUF=sub(rexp,"\\1",x), MAKE=sub(rexp,"\\2",x))
    tail(y)
    #       MANUF      MAKE
    # 27  Porsche     914-2
    # 28    Lotus    Europa
    # 29     Ford Pantera L
    # 30  Ferrari      Dino
    # 31 Maserati      Bora
    # 32    Volvo      142E
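If you would rather not rely on backreferences, the same split can probably be done by locating the first space and cutting the string there. The following is only a sketch of that idea, not part of the approach above: split_first_space is a hypothetical helper name, and it assumes the separator is a literal space. Strings with no space at all (such as "Valiant") go entirely into MANUF and get an empty MAKE.

```r
# Hypothetical helper (illustration only): split each string at its FIRST space.
split_first_space <- function(x) {
  # Position of the first literal space in each string; -1 where there is none.
  # as.integer() drops the match attributes that regexpr() attaches.
  pos <- as.integer(regexpr(" ", x, fixed = TRUE))
  has_space <- pos > 0

  # Everything before the first space, or the whole string if there is no space.
  first <- ifelse(has_space, substr(x, 1, pos - 1), x)
  # Everything after the first space, or "" if there is no space.
  rest <- ifelse(has_space, substr(x, pos + 1, nchar(x)), "")

  data.frame(MANUF = first, MAKE = rest, stringsAsFactors = FALSE)
}

x <- rownames(mtcars)
y <- split_first_space(x)
tail(y)
```

Because only the first space is used as the cut point, any later spaces (as in "Pantera L") stay inside the MAKE column.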
Joshua Ulrich answered Oct 05 '22 17:10