
ifelse statement inside apply returns unexpected result

Tags: r

I am trying to use an ifelse statement inside apply and am getting an odd result. I get the expected answer if the variable marker is 1, but not when that variable is > 9.

Here is an example data set for which I get the correct answer:

my.data <- read.table(text = '
   REFNO   status    stage   marker   cumulative   newstage
 1018567      ccc       AA        0             1         AA
 1018567      aaa     NONE        0             1       NONE
 1018567      aaa       BB        1             1         BB
 1018567      bbb       CC        1             1         CC
 1018567      eee       CC        1             1         CC
 1018567      mmm       CC        1             1         CC
 1018567      ppp       CC        1             1         CC
 1019711      ddd       CC        1             1         CC
', header = TRUE, stringsAsFactors = FALSE)

my.data$newstage <- apply(my.data, 1, function(x) ifelse(x['status'] == 'aaa'  & 
                                          x['stage']      == 'NONE' & 
                                          x['marker']     == 0      & 
                                          x['cumulative'] > 0, 'BB', x['stage']))

my.data

The data set below differs in only one element from that above, but I do not obtain the correct answer.

my.data <- read.table(text = '
   REFNO   status    stage   marker    cumulative   newstage
 1018567      ccc       AA        0             1         AA
 1018567      aaa     NONE        0             1       NONE
 1018567      aaa       BB        1             1         BB
 1018567      bbb       CC        1             1         CC
 1018567      eee       CC        1             1         CC
 1018567      mmm       CC        1             1         CC
 1018567      ppp       CC        1             1         CC
 1019711      ddd       CC       14             1         CC
', header = TRUE, stringsAsFactors = FALSE)

my.data$newstage <- apply(my.data, 1, function(x) ifelse(x['status'] == 'aaa'  & 
                                          x['stage']      == 'NONE' & 
                                          x['marker']     == 0      & 
                                          x['cumulative'] > 0, 'BB', x['stage']))

my.data

Thank you for any suggestions. Perhaps I should be using an if statement instead of an if-else?

Specifically, I would like NONE to be replaced with BB for newstage in the second row.

Mark Miller asked Mar 19 '23 09:03


1 Answer

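If you look at apply(my.data2, 1, function(x) x), where my.data2 is your second data set, the marker column has two characters instead of one. This is because of the two-digit 14. apply first converts the data frame to a matrix, and because some columns are character the whole matrix becomes character. During that coercion each numeric column is formatted to the width of its longest element, so 0 is padded with a space and becomes " 0". Your code therefore evaluates " 0" == 0, which is FALSE, whereas "0" == 0 is TRUE: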

" 0" == 0
# [1] FALSE
"0" == 0
# [1] TRUE
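
To see the padding directly, here is a minimal sketch on a small made-up data frame (df and its values are hypothetical, not the data from the question). It shows what as.matrix, which apply calls internally, does to a numeric column that sits next to a character column, and one possible workaround if you want to stay with apply: convert the value back to numeric before comparing.

```r
# Hypothetical two-row data frame mixing a character and a numeric column
df <- data.frame(status = c("aaa", "ddd"), marker = c(0, 14),
                 stringsAsFactors = FALSE)

# apply() converts the data frame with as.matrix(); because one column is
# character, the numeric column is formatted to a common width
m <- as.matrix(df)
m[, "marker"]
# [1] " 0" "14"

# Comparing the padded string with 0 fails for every row ...
padded_match <- apply(df, 1, function(x) x["marker"] == 0)

# ... while converting back to numeric first gives the intended comparison
numeric_match <- apply(df, 1, function(x) as.numeric(x["marker"]) == 0)
```

Here padded_match is FALSE for both rows, while numeric_match is TRUE for the first row only.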

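Since ifelse is vectorized, you don't need apply at all. You could add the new column with within (or with, as akrun mentions), or simply assign the result of ifelse(...) to the column. The example here writes to a separate column, newStage, so it can be compared with the existing newstage: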

within(my.data2, {
    newStage <- ifelse(status == "aaa" & stage == "NONE" & marker == 0 & 
                           cumulative > 0, "BB", stage)
})
#     REFNO status stage marker cumulative newstage newStage
# 1 1018567    ccc    AA      0          1       AA       AA
# 2 1018567    aaa  NONE      0          1     NONE       BB
# 3 1018567    aaa    BB      1          1       BB       BB
# 4 1018567    bbb    CC      1          1       CC       CC
# 5 1018567    eee    CC      1          1       CC       CC
# 6 1018567    mmm    CC      1          1       CC       CC
# 7 1018567    ppp    CC      1          1       CC       CC
# 8 1019711    ddd    CC     14          1       CC       CC
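
If you would rather overwrite newstage in place than add a second column, a sketch using with could look like the following. The two-row my.data2 here is a cut-down, illustrative version of the second data set, kept short so the block runs on its own.

```r
# Cut-down version of the second data set (two rows only, for illustration)
my.data2 <- read.table(text = '
   REFNO   status    stage   marker    cumulative   newstage
 1018567      aaa     NONE        0             1       NONE
 1019711      ddd       CC       14             1         CC
', header = TRUE, stringsAsFactors = FALSE)

# with() evaluates the vectorized ifelse() on the columns directly, so
# marker and cumulative stay numeric and no padding occurs
my.data2$newstage <- with(my.data2,
    ifelse(status == "aaa" & stage == "NONE" & marker == 0 & cumulative > 0,
           "BB", stage))

my.data2$newstage
# [1] "BB" "CC"
```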
Rich Scriven answered Apr 01 '23 21:04