Nested ifelse statement

I'm still learning how to translate SAS code into R, and I get warnings. I need to understand where I'm making mistakes. What I want to do is create a variable that summarizes and differentiates 3 statuses of a population: mainland, overseas, foreigner. I have a database with 2 variables:

  • id nationality: idnat (french, foreigner),

If idnat is french then:

  • id birthplace: idbp (mainland, colony, overseas)

I want to summarize the info from idnat and idbp into a new variable called idnat2:

  • status: k (mainland, overseas, foreigner)

All these variables use "character type".

Results expected in column idnat2 :

        idnat     idbp   idnat2
    1  french mainland mainland
    2  french   colony overseas
    3  french overseas overseas
    4 foreign  foreign  foreign

Here is the SAS code I want to translate into R:

    if idnat = "french" then do;
       if idbp in ("overseas","colony") then idnat2 = "overseas";
       else idnat2 = "mainland";
    end;
    else idnat2 = "foreigner";
    run;

Here is my attempt in R:

    if(idnat=="french"){
        idnat2 <- "mainland"
    } else if(idbp=="overseas"|idbp=="colony"){
        idnat2 <- "overseas"
    } else {
        idnat2 <- "foreigner"
    }

I receive this warning:

    Warning message:
    In if (idnat=="french") { :
      the condition has length > 1 and only the first element will be used

I was advised to use a "nested ifelse" instead because it is easier, but I get more warnings:

    idnat2 <- ifelse (idnat=="french", "mainland",
            ifelse (idbp=="overseas"|idbp=="colony", "overseas")
          )
                else (idnat2 <- "foreigner")

According to the warning message, the length is greater than 1, so only what's between the first brackets will be taken into account. Sorry, but I don't understand what this length has to do with it. Does anybody know where I'm going wrong?

balour asked Aug 02 '13 08:08


1 Answer

If you are using any spreadsheet application, there is a basic function if() with the syntax:

if(<condition>, <yes>, <no>) 

The syntax is exactly the same for ifelse() in R:

ifelse(<condition>, <yes>, <no>) 

The only difference from if() in a spreadsheet application is that R's ifelse() is vectorized (it takes vectors as input and returns a vector as output). Consider the following comparison of formulas in a spreadsheet application and in R, for an example where we want to test whether a > b and return 1 if so and 0 if not.

In spreadsheet:

      A  B  C
    1 3  1  =if(A1 > B1, 1, 0)
    2 2  2  =if(A2 > B2, 1, 0)
    3 1  3  =if(A3 > B3, 1, 0)

In R:

    > a <- 3:1; b <- 1:3
    > ifelse(a > b, 1, 0)
    [1] 1 0 0

ifelse() can be nested in many ways:

    ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>))

    ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>)

    ifelse(<condition>,
           ifelse(<condition>, <yes>, <no>),
           ifelse(<condition>, <yes>, <no>)
          )

    ifelse(<condition>, <yes>,
           ifelse(<condition>, <yes>,
                  ifelse(<condition>, <yes>, <no>)
                 )
           )
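As a concrete instance of the first pattern, here is a minimal sketch. The vector x and the three labels are made up for illustration and are not part of the question's data:

```r
# Hypothetical input vector, used only for illustration
x <- c(-2, 0, 5)

# Conditions are checked in order; the innermost <no> is the fallback
sign_label <- ifelse(x < 0, "negative",
                     ifelse(x == 0, "zero", "positive"))
sign_label
```

Each element of x gets its own label, because both the outer and the inner ifelse() work element by element.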

To calculate column idnat2 you can:

    df <- read.table(header=TRUE, text="
    idnat idbp idnat2
    french mainland mainland
    french colony overseas
    french overseas overseas
    foreign foreign foreign"
    )

    with(df,
         ifelse(idnat=="french",
           ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign")
         )
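To actually store the result, assign it to a column. The following is a sketch that rebuilds the question's four rows with data.frame() (an assumption made so the example runs on its own) and writes the nested ifelse() result into df$idnat2:

```r
# The four example rows from the question, rebuilt for a self-contained run
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Outer test: nationality; inner test: birthplace (only matters for "french")
df$idnat2 <- with(df,
  ifelse(idnat == "french",
         ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
         "foreign")
)

print(df)
```

This mirrors the structure of the SAS code: the inner test is only consulted for rows where idnat is "french".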

R Documentation

What does "the condition has length > 1 and only the first element will be used" mean? Let's see:

    > # What is first condition really testing?
    > with(df, idnat=="french")
    [1]  TRUE  TRUE  TRUE FALSE
    > # This is result of vectorized function - equality of all elements in idnat and
    > # string "french" is tested.
    > # Vector of logical values is returned (has the same length as idnat)
    > df$idnat2 <- with(df,
    +   if(idnat=="french"){
    +   idnat2 <- "xxx"
    +   }
    +   )
    Warning message:
    In if (idnat == "french") { :
      the condition has length > 1 and only the first element will be used
    > # Note that the first element of comparison is TRUE and that's why we get:
    > df
        idnat     idbp idnat2
    1  french mainland    xxx
    2  french   colony    xxx
    3  french overseas    xxx
    4 foreign  foreign    xxx
    > # There is really logic in it, you have to get used to it
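Put differently, if() needs a single TRUE or FALSE. As far as I know, from R 4.2.0 onward a condition of length > 1 is an error rather than a warning, so it is worth giving if() a length-one condition explicitly. A small sketch of two ways to do that (the "xxx" and "yyy" labels are placeholders):

```r
idnat <- c("french", "french", "french", "foreign")

# Option 1: collapse the logical vector to one value before calling if()
all_french <- all(idnat == "french")   # is every element "french"?
any_french <- any(idnat == "french")   # is at least one element "french"?

# Option 2: test one element at a time, so each condition has length 1
idnat2 <- character(length(idnat))
for (i in seq_along(idnat)) {
  if (idnat[i] == "french") {
    idnat2[i] <- "xxx"
  } else {
    idnat2[i] <- "yyy"
  }
}
idnat2
```

The loop does by hand what ifelse() does in one call, which is why ifelse() is usually the more convenient choice here.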

Can I still use if()? Yes, you can, but the syntax is not so cool :)

    test <- function(x) {
      if(x=="french") {
        "french"
      } else{
        "not really french"
      }
    }

    apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
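A possible variation on the same idea, not part of the original answer: vapply() also calls the function once per element, and additionally checks that every result is a single string. The idnat vector is rebuilt here so the sketch runs on its own:

```r
idnat <- c("french", "french", "french", "foreign")

test <- function(x) {
  if (x == "french") "french" else "not really french"
}

# vapply() calls test() once per element and verifies that each
# result is a character vector of length 1
res <- vapply(idnat, test, character(1), USE.NAMES = FALSE)
res
```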

If you are familiar with SQL, you can also use a CASE statement with the sqldf package.
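A rough sketch of what that could look like, assuming the sqldf package is installed (the code is skipped otherwise). The query and the sql_res name are my own illustration, not taken from the package documentation:

```r
# The four example rows from the question, rebuilt for a self-contained run
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Only runs if the sqldf package is available
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # CASE branches are evaluated top to bottom; ELSE is the fallback
  sql_res <- sqldf("
    SELECT idnat, idbp,
           CASE
             WHEN idnat = 'french' AND idbp IN ('overseas', 'colony') THEN 'overseas'
             WHEN idnat = 'french' THEN 'mainland'
             ELSE 'foreign'
           END AS idnat2
    FROM df
  ")
  print(sql_res)
}
```

The CASE expression reads much like the SAS original, which may make it the most familiar option for someone coming from SAS or SQL.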

Tomas Greif answered Sep 21 '22 03:09