In base R one can easily filter to rows where two columns are equal like so:
mtcars[mtcars$cyl==mtcars$carb,]
Using dplyr's filter this can be done just as easily:
mtcars %>% filter(cyl==carb)
If I am writing a function around this code, I would want to use filter_, but the following doesn't work:
mtcars %>% filter_("cyl"=="carb")
In this case it thinks "carb" is a value to test rather than a variable.
My question is how can you use filter_ to compare two variables in a data.frame?
Put the whole thing in quotes:
mtcars %>% filter_("cyl==carb")
Or, as effel has already suggested, this will also work:
mtcars %>% filter_(~cyl==carb)
There's more on this here.
It’s best to use a formula, because a formula captures both the expression to evaluate and the environment in which it should be evaluated. This is important if the expression is a mixture of variables in the data frame and objects in the local environment:
library(dplyr)
airquality %>%
  filter_(~Month == Day)
#   Ozone Solar.R Wind Temp Month Day
# 1    NA      NA 14.3   56     5   5
# 2    NA     264 14.3   79     6   6
# 3    77     276  5.1   88     7   7
# 4    89     229 10.3   90     8   8
# 5    21     230 10.9   75     9   9
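Since the original goal was to wrap this comparison in a function, here is a minimal sketch of one way to do that. Note that filter_ has been deprecated since dplyr 0.7.0, so this sketch swaps in plain filter() with the .data pronoun instead; it assumes a newer dplyr is installed. The helper name filter_equal_cols and its arguments are hypothetical, made up for illustration.

```r
library(dplyr)

# Hypothetical helper: keep the rows where two columns, given by name
# as strings, hold equal values. Assumes dplyr >= 0.7.0 (for .data).
filter_equal_cols <- function(df, col_a, col_b) {
  # .data[[name]] looks the column up in the data frame by its string name
  df %>% filter(.data[[col_a]] == .data[[col_b]])
}

res <- filter_equal_cols(mtcars, "cyl", "carb")
res
```

Because the column names arrive as ordinary strings, the caller does not need to quote or build any expression.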
Alternatively:
There are three ways to quote inputs that dplyr understands:
With a formula, ~ mean(mpg).
With quote(), quote(mean(mpg)).
As a string: "mean(mpg)".
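To make the difference between the three forms concrete, here is a small base R sketch (no dplyr needed) that builds each one and evaluates it against mtcars. It only illustrates what each object carries; it is not meant to show how dplyr processes them internally. The variable names f, q and s are arbitrary.

```r
f <- ~ mean(mpg)        # formula: expression plus the environment it was created in
q <- quote(mean(mpg))   # call: expression only
s <- "mean(mpg)"        # string: must be parsed before it can be evaluated

class(f)  # "formula"
class(q)  # "call"
class(s)  # "character"

# Evaluate each with mtcars supplying the columns.
# For a one-sided formula, f[[2]] is the right-hand-side expression.
r_f <- eval(f[[2]], mtcars, environment(f))
r_q <- eval(q, mtcars)
r_s <- eval(parse(text = s)[[1]], mtcars)
```

Only the formula records where it was created, which is why it can still find local objects when it is evaluated somewhere else.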