
How to simplify handling of nested ifelse() structures in base R?

Tags: r, if-statement

I'm facing nested ifelse() structures:

df1$var <- ifelse(x < a, u, ifelse(x < b, v, ifelse(x < c, w, ...)))

where u, v, w, ... are actually functions.

A simplified working example would be:

df1 <- data.frame(x = rbinom(100, 5, .5))
# x lives inside df1, so it has to be referenced as df1$x
df1$y <- ifelse(df1$x == 1, "s",
         ifelse(df1$x == 2, "t",
         ifelse(df1$x == 3, "u",
         ifelse(df1$x == 4, "v", "w"))))

Ideally there would be a base R method (for the sake of speed) to simplify such code, for instance a function like

rave.ifelse(x, 1=s, 2=t, ...)

I took a look at cut(x, 5), but I could not see how to apply it here.

Note: Values of x could be either numbers or factors, == could be any comparison operator, and the s, t, ... are actually functions.

edit:

Note: The number of ifelse()s is known and large. The solution should really fit the df1$var <- ifelse(x < a, u, ifelse(x < b, v, ifelse(x < c, w, ...))) situation, where u, v, w, ... are functions, e.g. u=sample(0:9, 1), v=runif(1),.... It should not be significantly slower than ifelse().

jay.sf asked Dec 19 '22 01:12


2 Answers

You could use case_when() from the dplyr package:

library(dplyr)

# x is a column of df1, so refer to it as df1$x
df1$y <- case_when(
    df1$x == 1 ~ "s",
    df1$x == 2 ~ "t",
    df1$x == 3 ~ "u",
    df1$x == 4 ~ "v",
    TRUE ~ "w"
)

Note that the final case above (TRUE) is the catch-all else condition: conditions are checked in order, so it matches every element not matched by an earlier condition.
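Since the question also mentions chains such as x < a, x < b, ..., here is a small sketch of how case_when() treats overlapping conditions. The values and labels are made up for illustration and are not from the question; the only assumption is that dplyr is installed.

```r
library(dplyr)

# Illustrative values, not taken from the question
x <- c(1, 5, 12)

# The conditions overlap on purpose: 12 satisfies both x > 10 and x > 3,
# but only the first matching condition is used for each element.
size <- case_when(
    x > 10 ~ "big",
    x > 3  ~ "medium",
    TRUE   ~ "small"
)
size
```

So the order of the conditions matters in the same way as the nesting order of the ifelse() calls does.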

Tim Biegeleisen answered May 08 '23 23:05


Since you insist on base R, here are two possibilities:

Define a mapping data.frame:

# Define mapping
map <- cbind.data.frame(
    x = c(1, 2, 3, 4, NA),
    y = c("s", "t", "u", "v", "w"));

Method 1: Match entries from map to df1.

# match entries; values of x without a mapping become NA and fall back to "w"
df1$y <- map[match(df1$x, map$x), 2];
df1$y[is.na(df1$y)] <- "w";

Method 2: Loop through all mappings, and replace using direct indexing:

# for loop
df1$y <- factor("w", levels = map$y);
for (i in 1:nrow(map)) df1$y[df1$x == map$x[i]] <- map$y[i];

Output:

tail(df1);
#    x y
#95  4 v
#96  1 s
#97  4 v
#98  2 t
#99  4 v
#100 1 s

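Note that the second method also works for inequalities: replace == in the loop with the comparison you need. Because later iterations overwrite earlier ones, order the mappings so that the highest-priority condition comes last.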
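As a rough sketch of the inequality case, assuming three increasing thresholds standing in for a, b, c from the question: the thresholds, labels and fallback value below are invented for illustration. The loop is the same idea as Method 2, and findInterval() is shown as one possible vectorised base R alternative, not as the method of this answer.

```r
# Illustrative thresholds a < b < c and labels, not taken from the question
breaks   <- c(1, 2, 3)
labels   <- c("u", "v", "w")
fallback <- "z"

x <- c(0.5, 1, 1.5, 2.5, 3.5)

# Loop version: start with the fallback, then apply the conditions from the
# largest threshold down, so the smallest matching threshold is assigned last
# and therefore wins.
y_loop <- rep(fallback, length(x))
for (i in rev(seq_along(breaks))) {
    y_loop[x < breaks[i]] <- labels[i]
}

# Vectorised alternative: findInterval() counts how many breaks are <= x,
# which is used here as an index into the labels (plus the fallback).
y_vec <- c(labels, fallback)[findInterval(x, breaks) + 1]

# Reference: the nested ifelse() form from the question
y_ref <- ifelse(x < 1, "u", ifelse(x < 2, "v", ifelse(x < 3, "w", "z")))

identical(y_loop, y_ref)
identical(y_vec, y_ref)
```

Both comparisons should give TRUE for this input. Whether findInterval() is fast enough for a large number of conditions would need to be benchmarked on the real data.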


Sample data

set.seed(2017);
df1 <- data.frame(x = rbinom(100, 5, .5))

Maurits Evers answered May 08 '23 21:05