I have a dataset with the following structure:
    Classes ‘tbl_df’ and 'data.frame': 10 obs. of 7 variables:
     $ GdeName  : chr "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" ...
     $ Partei   : chr "BDP" "CSP" "CVP" "EDU" ...
     $ Stand1971: num NA NA 4.91 NA 3.21 ...
     $ Stand1975: num NA NA 5.389 0.438 4.536 ...
     $ Stand1979: num NA NA 6.2774 0.0195 3.4355 ...
     $ Stand1983: num NA NA 4.66 1.41 3.76 ...
     $ Stand1987: num NA NA 3.48 1.65 5.75 ...
I want to provide a function that computes the difference between any two of the value columns, and I would like to do this using dplyr's mutate function, like so (assume the parameters from and to are passed as arguments):
    from <- "Stand1971"
    to <- "Stand1987"
    data %>% mutate(diff = from - to)
Of course, this doesn't work, as dplyr uses non-standard evaluation. I know there's now an elegant solution to the problem using mutate_, and I've read this vignette, but I still can't get my head around it.
What to do?
Here are the first few rows of the dataset for a reproducible example:
structure(list(GdeName = c("Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis" ), Partei = c("BDP", "CSP", "CVP", "EDU", "EVP", "FDP", "FGA", "FPS", "GLP", "GPS"), Stand1971 = c(NA, NA, 4.907306434, NA, 3.2109535926, 18.272143463, NA, NA, NA, NA), Stand1975 = c(NA, NA, 5.389079711, 0.4382328556, 4.5363022622, 18.749259742, NA, NA, NA, NA), Stand1979 = c(NA, NA, 6.2773722628, 0.0194647202, 3.4355231144, 25.294403893, NA, NA, NA, 2.7055961071), Stand1983 = c(NA, NA, 4.6609804428, 1.412940467, 3.7563539244, 26.277246489, 0.8529335746, NA, NA, 2.601878177), Stand1987 = c(NA, NA, 3.4767860929, 1.6535933856, 5.7451770193, 22.146844746, NA, 3.7453183521, NA, 13.702211858 )), .Names = c("GdeName", "Partei", "Stand1971", "Stand1975", "Stand1979", "Stand1983", "Stand1987"), class = c("tbl_df", "data.frame" ), row.names = c(NA, -10L))
Using the latest version of dplyr (>= 0.7), you can use the rlang !! (bang-bang) operator.
    library(tidyverse)
    from <- "Stand1971"
    to <- "Stand1987"
    data %>% mutate(diff = (!!as.name(from)) - (!!as.name(to)))
You just need to convert the strings to names with as.name and then insert them into the expression. Unfortunately, I seem to need a few more parentheses than I would like, because the !! operator seems to have an unexpected operator precedence.
Original answer, for dplyr versions from 0.3 up to (but not including) 0.7:
From that vignette (vignette("nse","dplyr")), use lazyeval's interp() function:
    library(lazyeval)
    from <- "Stand1971"
    to <- "Stand1987"
    data %>% mutate_(diff = interp(~from - to, from = as.name(from), to = as.name(to)))
You can now use the .data pronoun inside a dplyr chain.
    library(dplyr)
    from <- "Stand1971"
    to <- "Stand1987"
    data %>% mutate(diff = .data[[from]] - .data[[to]])
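Since the question asks for a function, the .data approach can be wrapped in one. This is only a minimal sketch: the helper name add_diff and the small toy tibble are made up for illustration (the values are rounded from the question's data), and it assumes a dplyr version that supports the .data pronoun.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): adds a column `diff` holding the
# difference between two columns whose names are passed as strings.
add_diff <- function(df, from, to) {
  # .data[[name]] looks the column up by its string name inside mutate()
  mutate(df, diff = .data[[from]] - .data[[to]])
}

# Toy data, made up for illustration (rounded values from the question)
toy <- tibble(
  Partei    = c("CVP", "EVP", "FDP"),
  Stand1971 = c(4.9, 3.2, 18.3),
  Stand1987 = c(3.5, 5.7, 22.1)
)

add_diff(toy, "Stand1971", "Stand1987")
```

Because from and to are ordinary function arguments here, any pair of column names can be passed without the caller needing to know about non-standard evaluation.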
Another option is to use sym with bang-bang (!!):
data %>% mutate(diff = !!sym(from) - !!sym(to))
In base R, we can use:
data$diff <- data[[from]] - data[[to]]