I can't use switch inside of mutate because it returns the whole vector instead of just the row. As a hack, I'm using:
    pick <- function(x, v1, v2, v3, v4) {
      ifelse(x == 1, v1,
        ifelse(x == 2, v2,
          ifelse(x == 3, v3,
            ifelse(x == 4, v4, NA))))
    }
This works inside of mutate, and is fine for now because I'm typically choosing among 4 things, but that may change. Can you recommend an alternative?
For example:
    library(dplyr)
    df.faithful <- tbl_df(faithful)
    df.faithful$x <- sample(1:4, 272, rep=TRUE)
    df.faithful$y1 <- rnorm(n=272, mean=7, sd=2)
    df.faithful$y2 <- rnorm(n=272, mean=5, sd=2)
    df.faithful$y3 <- rnorm(n=272, mean=7, sd=1)
    df.faithful$y4 <- rnorm(n=272, mean=5, sd=1)
Using pick:

    mutate(df.faithful, y = pick(x, y1, y2, y3, y4))

    Source: local data frame [272 x 8]

       eruptions waiting x        y1        y2       y3       y4        y
     1     3.600      79 1  8.439092 5.7753006 8.319372 5.078558 8.439092
     2     1.800      54 2 13.515956 6.1971512 6.343157 4.962349 6.197151
     3     3.333      74 4  7.693941 6.8973365 5.406684 5.425404 5.425404
     4     2.283      62 4 12.595852 6.9953995 7.864423 3.730967 3.730967
     5     4.533      85 3 11.952922 5.1512987 9.177687 5.511899 9.177687
     6     2.883      55 3  7.881350 1.0289711 6.304004 3.554056 6.304004
     7     4.700      88 4  8.636709 6.3046198 6.788619 5.748269 5.748269
     8     3.600      85 1  8.027371 6.3535056 7.152698 7.034976 8.027371
     9     1.950      51 1  5.863370 0.1707758 5.750440 5.058107 5.863370
    10     4.350      85 1  7.761653 6.2176610 8.348378 1.861112 7.761653
    ..       ...     ... .       ...       ...      ...      ...      ...
As the output shows, the value from y1 is copied into y when x == 1, and so on. This is what I'm looking to do, but I want it to work whether I have 4 or 400 columns.
Trying to use switch:

    mutate(df.faithful, y = switch(x, y1, y2, y3, 4))

    Error in switch(c(1L, 2L, 4L, 4L, 3L, 3L, 4L, 1L, 1L, 1L, 4L, 3L, 1L, :
      EXPR must be a length 1 vector
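The error comes from switch() requiring a single value for EXPR, while mutate hands it the entire x column. As a minimal sketch of one workaround (the toy data and the helper name pick_one are invented here for illustration, not part of the original post), switch() can be applied one row at a time with base R's mapply():

```r
# Toy data, invented for illustration (not the faithful data from the question)
df <- data.frame(
  x  = c(1L, 2L, 4L, 3L),
  y1 = c(10, 11, 12, 13),
  y2 = c(20, 21, 22, 23),
  y3 = c(30, 31, 32, 33),
  y4 = c(40, 41, 42, 43)
)

# Hypothetical helper: called once per row, so switch() always
# receives a length-1 EXPR
pick_one <- function(k, a, b, c, d) switch(k, a, b, c, d)

# mapply() walks the five vectors in parallel, one element at a time
df$y <- mapply(pick_one, df$x, df$y1, df$y2, df$y3, df$y4)
df$y
```

This still hard-codes four columns and calls switch() once per row, so it is likely to be slow on large data; it mainly illustrates why the vectorized call fails.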
Trying to use list:

    mutate(df.faithful, y = list(y1, y2, y3, y4)[[x]])

    Error in list(c(8.43909205142925, 13.5159559591257, 7.69394050059568, :
      recursive indexing failed at level 2
Trying to use c:

    mutate(df.faithful, y = c(y1, y2, y3, y4)[x])

    Source: local data frame [272 x 8]

       eruptions waiting x        y1        y2       y3       y4         y
     1     3.600      79 1  8.439092 5.7753006 8.319372 5.078558  8.439092
     2     1.800      54 2 13.515956 6.1971512 6.343157 4.962349 13.515956
     3     3.333      74 4  7.693941 6.8973365 5.406684 5.425404 12.595852
     4     2.283      62 4 12.595852 6.9953995 7.864423 3.730967 12.595852
     5     4.533      85 3 11.952922 5.1512987 9.177687 5.511899  7.693941
     6     2.883      55 3  7.881350 1.0289711 6.304004 3.554056  7.693941
     7     4.700      88 4  8.636709 6.3046198 6.788619 5.748269 12.595852
     8     3.600      85 1  8.027371 6.3535056 7.152698 7.034976  8.439092
     9     1.950      51 1  5.863370 0.1707758 5.750440 5.058107  8.439092
    10     4.350      85 1  7.761653 6.2176610 8.348378 1.861112  8.439092
    ..       ...     ... .       ...       ...      ...      ...       ...
No errors are produced, but the behavior is not as intended.
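The c() attempt runs without error because c(y1, y2, y3, y4) simply glues the four columns end to end, so indexing by x returns elements 1 to 4 of y1 rather than the x-th column's value for each row. The sketch below, on invented toy data, shows the mismatch, then an offset-corrected index, then the same idea written as base R matrix indexing, which should extend to any number of columns. The column naming scheme (y1 to y4) and the requirement that x holds the position of the wanted column are assumptions of this sketch.

```r
# Toy data, invented for illustration
df <- data.frame(
  x  = c(1L, 2L, 4L, 3L),
  y1 = c(10, 11, 12, 13),
  y2 = c(20, 21, 22, 23),
  y3 = c(30, 31, 32, 33),
  y4 = c(40, 41, 42, 43)
)
n <- nrow(df)

# c() concatenates the columns, so indexing by x only reaches
# the first few elements of y1
stacked <- c(df$y1, df$y2, df$y3, df$y4)
wrong <- stacked[df$x]                      # same as df$y1[df$x]

# Column k starts at offset (k - 1) * n in the stacked vector;
# adding the row number selects that row's element
right <- stacked[(df$x - 1L) * n + seq_len(n)]

# Same idea for an arbitrary set of columns: index a matrix
# with (row, column) pairs
cols <- paste0("y", 1:4)                    # assumed column names
m <- as.matrix(df[cols])
by_matrix <- m[cbind(seq_len(n), df$x)]
```

Inside mutate, the offset form would read roughly as c(y1, y2, y3, y4)[(x - 1) * n() + row_number()], though the matrix version avoids listing every column by hand.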
Eons too late for the OP, but in case this shows up in a search ...
dplyr v0.5 has recode(), a vectorized version of switch(), so

    data_frame(
      x = sample(1:4, 10, replace=TRUE),
      y1 = rnorm(n=10, mean=7, sd=2),
      y2 = rnorm(n=10, mean=5, sd=2),
      y3 = rnorm(n=10, mean=7, sd=1),
      y4 = rnorm(n=10, mean=5, sd=1)
    ) %>%
      mutate(y = recode(x, y1, y2, y3, y4))
produces, as anticipated:
    # A tibble: 10 x 6
           x        y1       y2       y3       y4        y
       <int>     <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
     1     2  6.950106 6.986780 7.826778 6.317968 6.986780
     2     1  5.776381 7.706869 7.982543 5.048649 5.776381
     3     2  7.315477 2.213855 6.079149 6.070598 2.213855
     4     3  7.461220 5.100436 7.085912 4.440829 7.085912
     5     3  5.780493 4.562824 8.311047 5.612913 8.311047
     6     3  5.373197 7.657016 7.049352 4.470906 7.049352
     7     2  6.604175 9.905151 8.359549 6.430572 9.905151
     8     3 11.363914 4.721148 7.670825 5.317243 7.670825
     9     3 10.123626 7.140874 6.718351 5.508875 6.718351
    10     4  5.407502 4.650987 5.845482 4.797659 4.797659
(Also works with named args, including character and factor x's.)
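To illustrate the named-argument form, here is a minimal sketch with a character x. The data frame and the keys "a" and "b" are made up for this example, and it assumes a dplyr version in which recode() accepts full-length vectors as replacement values, as the answer above relies on.

```r
library(dplyr)

# Toy data, invented for illustration: x names which column to take
df <- data.frame(
  x  = c("a", "b", "b", "a"),
  y1 = c(10, 11, 12, 13),
  y2 = c(20, 21, 22, 23),
  stringsAsFactors = FALSE
)

# Each argument name is matched against the values of x;
# the matching rows take their value from the column given on the right
out <- mutate(df, y = recode(x, a = y1, b = y2))
out$y
```

Newer dplyr releases mark recode() as superseded, so it is worth checking the documentation for the installed version before relying on this pattern.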