In base R one can easily filter to rows where two columns are equal, like so:
mtcars[mtcars$cyl==mtcars$carb,]
Using dplyr's filter, this can also be done easily:
mtcars %>% filter(cyl==carb)
But if I am writing a function using this code I would want to use filter_, and this code doesn't work:
mtcars %>% filter_("cyl"=="carb")
This is because it treats "carb" as a value to test rather than as a variable.
My question is: how can you use filter_ to compare two variables in a data.frame?
Put the whole thing in quotes:
mtcars %>% filter_("cyl==carb")
Or, as effel has already suggested, this will also work:
mtcars %>% filter_(~cyl==carb)
There's more on this here.
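Since the question is about wrapping this in a function, here is a minimal sketch of one way to do it, assuming a dplyr version in which filter_ is still available. The helper name filter_equal_cols and its arguments are hypothetical, made up for illustration, and the approach assumes the column names are syntactically valid R names.

```r
library(dplyr)

# Hypothetical helper: keep the rows of `df` where two columns are equal.
# `col_a` and `col_b` are column names passed as strings
# (assumed to be syntactic names, e.g. no spaces).
filter_equal_cols <- function(df, col_a, col_b) {
  # Build a one-sided formula such as ~ cyl == carb from the two names
  condition <- as.formula(paste("~", col_a, "==", col_b))
  filter_(df, condition)
}

same_cyl_carb <- filter_equal_cols(mtcars, "cyl", "carb")
```

Building a formula rather than a plain string keeps this in line with the formula form shown above.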
It's best to use a formula, because a formula captures both the expression to evaluate and the environment in which it should be evaluated. This is important if the expression is a mixture of variables in the data frame and objects in the local environment:
library(dplyr)
airquality %>% filter_(~Month == Day)
# Ozone Solar.R Wind Temp Month Day
# 1 NA NA 14.3 56 5 5
# 2 NA 264 14.3 79 6 6
# 3 77 276 5.1 88 7 7
# 4 89 229 10.3 90 8 8
# 5 21 230 10.9 75 9 9
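To see what "capturing the environment" means in practice, here is a rough base R illustration. It is only a sketch of the idea, not how dplyr evaluates things internally; make_condition and min_mpg are made-up names.

```r
# A formula created inside a function remembers that function's environment.
make_condition <- function() {
  min_mpg <- 25        # local object, not a column of mtcars
  ~ mpg > min_mpg      # `mpg` is a column, `min_mpg` is local
}

cond <- make_condition()

# Evaluate the right-hand side of the formula: names are looked up in the
# data frame first, then in the environment the formula captured.
keep <- eval(cond[[2]], mtcars, environment(cond))
high_mpg <- mtcars[keep, ]

# A bare string carries no environment, so `min_mpg` cannot be found
# when the same expression is evaluated from outside the function.
string_result <- try(
  eval(parse(text = "mpg > min_mpg")[[1]], mtcars),
  silent = TRUE
)
inherits(string_result, "try-error")
```

The string version only works when every name it mentions is either a column or visible from wherever it ends up being evaluated, which is why the formula is the safer choice inside functions.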
Alternatively:
There are three ways to quote inputs that dplyr understands: with a formula, ~ mean(mpg); with quote(), quote(mean(mpg)); or as a string, "mean(mpg)".
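As a small base R sketch of how the three forms relate (this does not use dplyr, and the variable names are made up for illustration), each one can be reduced to the same call and evaluated against a data frame:

```r
# The same expression, quoted three ways
as_formula <- ~ mean(mpg)
as_quoted  <- quote(mean(mpg))
as_string  <- "mean(mpg)"

# Reduce each one to a plain call, then evaluate it with mtcars as the data
from_formula <- eval(as_formula[[2]], mtcars, environment(as_formula))
from_quoted  <- eval(as_quoted, mtcars)
from_string  <- eval(parse(text = as_string)[[1]], mtcars)

# All three should give the same value
all.equal(from_formula, from_quoted)
all.equal(from_quoted, from_string)
```

Only the formula brings its own environment along; the other two rely on whatever environment they happen to be evaluated in.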