
What does the error "the condition has length > 1 and only the first element will be used" mean? [duplicate]

Here is my data set:

FullName <- c("Jimmy John Cephus", "Frank Chester", "Hank Chester", "Brody Buck Clyde", "Merle Rufus Roscoe Jed Quaid")
df <- data.frame(FullName)

Goal: Look into FullName for any spaces, " ", and extract out the FirstName.

My first step is to bring in the stringr package, because I will use its str_count() and word() functions.

Next I test stringr::str_count(df$FullName, " ") against the df and R returns:

[1] 2 1 1 2 4

This is what I expect.

Next I test the word() function:

stringr::word(df$FullName, 1)

R returns:

[1] "Jimmy" "Frank" "Hank"  "Brody" "Merle"

Again, this is what I expect.

Next I construct a simple UDF (user-defined function) that incorporates the str_count() function:

split_firstname = function(full_name){
  x <- stringr::str_count(full_name, " ")
  return(x)
}
split_firstname(df$FullName)

Again, R provides what I expect:

[1] 2 1 1 2 4

As a final step, I incorporate the word() function into the UDF and code for all of the conditions:

split_firstname = function(full_name){
  x <- stringr::str_count(full_name, " ")
  if(x==1){
    return(stringr::word(full_name,1))
  }else if(x==2){
    return(paste(stringr::word(full_name,1), stringr::word(full_name,2), sep = " "))
  }else if(x==4){
    return(paste(stringr::word(full_name,1), stringr::word(full_name,2), stringr::word(full_name,3), stringr::word(full_name,4), sep = " "))
  }
}

Then I call the UDF and pass to it the FullName from the df:

split_firstname(df$FullName)

This time I did NOT get what I expected. R returned:

[1] "Jimmy John"    "Frank Chester" "Hank Chester"  "Brody Buck"    "Merle Rufus"  
Warning messages:
1: In if (x == 1) { :
  the condition has length > 1 and only the first element will be used
2: In if (x == 2) { :
  the condition has length > 1 and only the first element will be used

I had expected R to return to me the following:

"Jimmy John", "Frank", "Hank", "Brody Buck", "Merle Rufus Roscoe Jed"
G-Bruce asked Oct 31 '17 12:10


2 Answers

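The problem is that you are using an if statement with a vector. if() expects a single TRUE or FALSE, so it looks only at the first element of x and applies the branch chosen for the first name to every name; that is why every result has two words. You can use the vectorised case_when function from dplyr, which evaluates the conditions element by element.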

library(dplyr)

split_firstname <- function(full_name){
  x <- stringr::str_count(full_name, " ")
  case_when(
    x == 1 ~ stringr::word(full_name, 1),
    x == 2 ~ paste(stringr::word(full_name,1), stringr::word(full_name,2), sep = " "),
    x == 4 ~ paste(stringr::word(full_name,1), stringr::word(full_name,2), stringr::word(full_name,3), stringr::word(full_name,4), sep = " ")
  )
}
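To see why the original function returned two words for every name, here is a minimal base R sketch that inspects the conditions directly. It assumes the space counts from the question; the names cond1, cond2 and first_branch are made up for illustration. Note that R 4.2.0 and later raise an error instead of the warning when if() receives a condition longer than one.

```r
# Space counts from the question, one per name.
x <- c(2, 1, 1, 2, 4)

# Each comparison is itself a vector: one TRUE/FALSE per name.
cond1 <- x == 1   # FALSE  TRUE  TRUE FALSE FALSE
cond2 <- x == 2   #  TRUE FALSE FALSE  TRUE FALSE

# if() can act on only one value. Older R versions took the first element
# and issued the warning shown in the question. Taking the first element by
# hand reproduces the branch that the original function chose:
first_branch <- if (cond1[1]) "x == 1" else if (cond2[1]) "x == 2" else "x == 4"
print(first_branch)  # "x == 2", so words 1 and 2 were pasted for all five names
```

Because the first name has two spaces, the x == 2 branch ran once on the whole vector, which matches the output in the question.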
amarchin answered Nov 09 '22 18:11


lukeA's answer is the best approach, but if you find you are unable to vectorise a function, sapply from base R and rowwise from dplyr can solve this problem too. Both call the function once per name, so the if condition only ever sees a single value:

df$first <- sapply(df$FullName, split_firstname)
head(df)
                      FullName                  first
1            Jimmy John Cephus             Jimmy John
2                Frank Chester                  Frank
3                 Hank Chester                   Hank
4             Brody Buck Clyde             Brody Buck
5 Merle Rufus Roscoe Jed Quaid Merle Rufus Roscoe Jed
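As a variation on the sapply call above, vapply does the same per-element loop but also checks that every result is a single string. The sketch below is self-contained and uses only base R; first_name_one is a hypothetical scalar helper written for this example, not the split_firstname function from the question, and as.character() is there in case the column is a factor.

```r
FullName <- c("Jimmy John Cephus", "Frank Chester", "Hank Chester",
              "Brody Buck Clyde", "Merle Rufus Roscoe Jed Quaid")

# Hypothetical scalar helper: expects exactly one name at a time.
first_name_one <- function(full_name) {
  parts <- strsplit(full_name, " ", fixed = TRUE)[[1]]
  n_spaces <- length(parts) - 1
  if (n_spaces < 1) {            # safe: n_spaces is a single number here
    return(full_name)
  }
  # Keep every word except the last one.
  paste(parts[seq_len(n_spaces)], collapse = " ")
}

# vapply calls the helper once per element and insists on character(1) results.
first <- vapply(as.character(FullName), first_name_one, character(1),
                USE.NAMES = FALSE)
print(first)
```

If the helper ever returned something other than one string, vapply would stop with an error rather than quietly returning a list, which is the main reason to prefer it over sapply here.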

library(dplyr)

df <- df %>% rowwise() %>% 
  mutate(split2 = split_firstname(FullName))

head(df)
                      FullName                  first                 split2
                        <fctr>                  <chr>                  <chr>
1            Jimmy John Cephus             Jimmy John             Jimmy John
2                Frank Chester                  Frank                  Frank
3                 Hank Chester                   Hank                   Hank
4             Brody Buck Clyde             Brody Buck             Brody Buck
5 Merle Rufus Roscoe Jed Quaid Merle Rufus Roscoe Jed Merle Rufus Roscoe Jed
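For comparison, a fully vectorised base R sketch needs neither a loop nor a branch per space count. This is a swapped-in technique, not taken from any of the answers here: it assumes that the first name is everything before the final space-separated word, which is what the expected output in the question suggests. The name drop_last_word is hypothetical.

```r
FullName <- c("Jimmy John Cephus", "Frank Chester", "Hank Chester",
              "Brody Buck Clyde", "Merle Rufus Roscoe Jed Quaid")

# Hypothetical helper: remove the last space and the word that follows it.
# sub() is vectorised, so it handles the whole column in one call.
drop_last_word <- function(full_name) {
  sub(" [^ ]+$", "", as.character(full_name))
}

first <- drop_last_word(FullName)
print(first)
```

A name with no space at all is returned unchanged, and a name with three spaces is handled the same way as any other, whereas the branch-per-count versions have no case for it.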
r.bot answered Nov 09 '22 18:11