I'm still learning how to translate SAS code into R, and I get warnings. I need to understand where I'm making mistakes. What I want to do is create a variable that summarizes and differentiates three statuses of a population: mainland, overseas, foreigner. I have a database with 2 variables:

idnat (french, foreigner)

If idnat is french, then:

idbp (mainland, colony, overseas)

I want to summarize the information from idnat and idbp into a new variable called idnat2. All these variables are of character type.

Expected results in column idnat2:

      idnat   idbp     idnat2
    1 french  mainland mainland
    2 french  colony   overseas
    3 french  overseas overseas
    4 foreign foreign  foreign
Here is the SAS code I want to translate into R:

    if idnat = "french" then do;
        if idbp in ("overseas","colony") then idnat2 = "overseas";
        else idnat2 = "mainland";
    end;
    else idnat2 = "foreigner";
    run;
Here is my attempt in R:
    if(idnat=="french"){
        idnat2 <- "mainland"
    } else if(idbp=="overseas"|idbp=="colony"){
        idnat2 <- "overseas"
    } else {
        idnat2 <- "foreigner"
    }
I receive this warning:
    Warning message:
    In if (idnat=="french") { :
      the condition has length > 1 and only the first element will be used
I was advised to use a "nested ifelse" instead because it is easier, but I get more warnings:

    idnat2 <- ifelse (idnat=="french", "mainland",
              ifelse (idbp=="overseas"|idbp=="colony", "overseas")
              )
              else (idnat2 <- "foreigner")

According to the warning message, the length is greater than 1, so only what's between the first brackets will be taken into account. Sorry, but I don't understand what this length has to do with it. Does anybody know where I'm going wrong?
If you use any spreadsheet application, you already know the basic function if(), with the syntax:

    if(<condition>, <yes>, <no>)

The syntax is exactly the same for ifelse() in R:

    ifelse(<condition>, <yes>, <no>)

The only difference from if() in a spreadsheet application is that R's ifelse() is vectorized (it takes vectors as input and returns a vector as output). Consider the following comparison of formulas in a spreadsheet application and in R, for an example where we want to test whether a > b and return 1 if it is and 0 if it is not.
In spreadsheet:
      A B C
    1 3 1 =if(A1 > B1, 1, 0)
    2 2 2 =if(A2 > B2, 1, 0)
    3 1 3 =if(A3 > B3, 1, 0)
In R:
    > a <- 3:1; b <- 1:3
    > ifelse(a > b, 1, 0)
    [1] 1 0 0
ifelse() can be nested in many ways:

    ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>))

    ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>)

    ifelse(<condition>,
           ifelse(<condition>, <yes>, <no>),
           ifelse(<condition>, <yes>, <no>)
          )

    ifelse(<condition>, <yes>,
           ifelse(<condition>, <yes>,
                  ifelse(<condition>, <yes>, <no>)
                 )
          )
To calculate column idnat2, you can do:

    df <- read.table(header=TRUE, text="
    idnat idbp idnat2
    french mainland mainland
    french colony overseas
    french overseas overseas
    foreign foreign foreign"
    )

    with(df,
         ifelse(idnat=="french",
           ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign")
         )
See the R documentation for ifelse() (run ?ifelse).
What does "the condition has length > 1 and only the first element will be used" mean? Let's see:

    > # What is the first condition really testing?
    > with(df, idnat=="french")
    [1]  TRUE  TRUE  TRUE FALSE
    > # This is the result of a vectorized comparison: every element of idnat
    > # is tested for equality with the string "french".
    > # A vector of logical values is returned (it has the same length as idnat).
    > df$idnat2 <- with(df,
    +   if(idnat=="french"){
    +   idnat2 <- "xxx"
    +   }
    +   )
    Warning message:
    In if (idnat == "french") { :
      the condition has length > 1 and only the first element will be used
    > # Note that the first element of the comparison is TRUE, and that's why we get:
    > df
        idnat     idbp idnat2
    1  french mainland    xxx
    2  french   colony    xxx
    3  french overseas    xxx
    4 foreign  foreign    xxx
    > # There really is logic in it; you just have to get used to it.

Note that since R 4.2.0, a condition of length greater than 1 in if() is an error rather than a warning.
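To make the point about length concrete, here is a minimal sketch of how if() can be made to behave like the row-by-row SAS logic: inside a loop, each condition is a single value. The data frame is rebuilt here from the question's table, and the "foreign" label follows the expected results rather than the SAS code; treat it as an illustration, not as the recommended approach (ifelse() is shorter).

```r
# Example data, assumed to mirror the table in the question.
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# The comparison is vectorized: it yields one logical value per row.
cond <- df$idnat == "french"
length(cond)  # 4, so it cannot serve as a single if() condition

# if() is fine once every condition has length 1, e.g. one row at a time.
idnat2 <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  if (df$idnat[i] == "french") {
    if (df$idbp[i] %in% c("overseas", "colony")) {
      idnat2[i] <- "overseas"
    } else {
      idnat2[i] <- "mainland"
    }
  } else {
    idnat2[i] <- "foreign"
  }
}
idnat2
```

The loop spells out what SAS does implicitly for each observation; the nested ifelse() call does the same work on whole columns at once.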
Can I still use if()? Yes, you can, but the syntax is not so cool :)

    # if() sees one element at a time because apply() calls test() per element
    test <- function(x) {
      if(x=="french") {
        "french"
      } else{
        "not really french"
      }
    }

    apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
If you are familiar with SQL, you can also use a CASE statement with the sqldf package.
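A rough sketch of what the CASE approach might look like, assuming the sqldf package is installed and using a data frame rebuilt from the question's table (the result name res is just for illustration):

```r
library(sqldf)  # assumption: the sqldf package is installed

# Example data, assumed to mirror the table in the question.
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# CASE evaluates its WHEN branches in order, once per row.
res <- sqldf("
  SELECT idnat, idbp,
         CASE
           WHEN idnat = 'french' AND idbp IN ('overseas', 'colony') THEN 'overseas'
           WHEN idnat = 'french' THEN 'mainland'
           ELSE 'foreign'
         END AS idnat2
  FROM df
")
res
```

Like ifelse(), the CASE expression is applied to every row, so no explicit loop is needed.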