Extracting first names in R

Question

Say I have a vector of people's names in my dataframe:

names <- c("Bernice Ingram", "Dianna Dean", "Philip Williamson", "Laurie Abbott",
           "Rochelle Price", "Arturo Fisher", "Enrique Newton", "Sarah Mann",
           "Darryl Graham", "Arthur Hoffman")

I want to create a vector with the first names. All I know about them is that they come first in the vector above and that they're followed by a space. In other words, this is what I'm looking for:

"Bernice" "Dianna"  "Philip" "Laurie" "Rochelle"
"Arturo"  "Enrique" "Sarah"  "Darryl" "Arthur"

I've found a similar question here, but the answers (especially this one) haven't helped much. So far, I've tried a couple of variations of functions from the grep family, and the closest I could get to something useful was by running strsplit(names, " ") to separate first and last names, and then strsplit(names, " ")[[1]][1] to get just the first name of the first person. I've been trying to tweak this last command to give me a whole vector of first names, to no avail.

Michele · Accepted Answer

Use sapply to extract the first name:

> sapply(strsplit(names, " "), `[`, 1)
 [1] "Bernice"  "Dianna"   "Philip"   "Laurie"   "Rochelle" "Arturo"   "Enrique" 
 [8] "Sarah"    "Darryl"   "Arthur"

Some comments:

The above works just fine. To make it a bit more general, you could change the split argument of strsplit from " " to "\\s+", which also covers runs of multiple spaces. You could also use gsub to extract everything before the first space directly. That approach needs only one function call and is likely to be faster (though I haven't benchmarked it).
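The two suggestions in the comment could look roughly like the sketch below. The vector messy_names is a made-up example with a double space in it, not data from the question. Note that the backslash has to be doubled inside an R string literal, and that sub is used here instead of gsub because only the first match needs replacing; that swap is my own choice, not something the answer specifies.

```r
# Hypothetical input; the second element contains two spaces in a row.
messy_names <- c("Bernice Ingram", "Dianna  Dean", "Philip Williamson")

# Split on one or more whitespace characters, then keep the first piece of each element.
first_split <- sapply(strsplit(messy_names, "\\s+"), `[`, 1)

# Drop everything from the first whitespace character onwards in one vectorised call.
first_sub <- sub("\\s.*$", "", messy_names)

print(first_split)
print(first_sub)
```

Both calls should give the same first names here. Which one is faster was not measured in the answer and is not measured here either.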

A5C1D2H2I1M1N2O1R2T1 · Answer

For what you want, here's a pretty unorthodox way to do it:

read.table(text = names, header = FALSE, stringsAsFactors=FALSE, fill = TRUE)[[1]]
# [1] "Bernice"  "Dianna"   "Philip"   "Laurie"   "Rochelle" "Arturo"   "Enrique"  "Sarah"   
# [9] "Darryl"   "Arthur"

zzxx53 · Answer

This seems to work:

unlist(strsplit(names,' '))[seq(1,2*length(names),2)]

This assumes that no first or last name contains a space.
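To see why that assumption matters, here is a small sketch with a made-up vector, mixed_names, in which one entry has three words. Picking every second token relies on each element having exactly two tokens, whereas taking the first token of each element separately does not. The vapply call is just one possible way to write the per-element version.

```r
# Hypothetical input: the second entry has three words instead of two.
mixed_names <- c("Bernice Ingram", "Mary Ann Dean", "Philip Williamson")

# Flatten all tokens and take positions 1, 3, 5: this drifts after the three-word entry.
tokens <- unlist(strsplit(mixed_names, " "))
by_position <- tokens[seq(1, 2 * length(mixed_names), 2)]

# Take the first token of each element on its own; token counts no longer matter.
per_element <- vapply(strsplit(mixed_names, " "), `[`, character(1), 1)

print(by_position)
print(per_element)
```

Neither version can tell whether "Mary Ann" is meant as a two-word first name; the per-element version simply returns the first word of every entry.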

Tags: regex, r

Waldir Leoncio
