
Filter dataframe using global variable with the same name as column name [duplicate]

Tags: r, dplyr

library(dplyr)

Toy dataset:

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
df
  x y
1 1 4
2 2 5
3 3 6

This works fine:

df %>% filter(y == 5)
  x y
1 2 5

This also works fine:

z <- 5
df %>% filter(y == z)
  x y
1 2 5

But this fails, returning every row instead of only the one where y is 5:

y <- 5
df %>% filter(y == y)
  x y
1 1 4
2 2 5
3 3 6

Apparently, dplyr cannot make the distinction between its column y and the global variable y. Is there a way to tell dplyr that the second y is the global variable?

Marco asked Oct 21 '16 06:10


2 Answers

You can do:

df %>% filter(y == .GlobalEnv$y)

or:

df %>% filter(y == .GlobalEnv[["y"]])

Both of these work in this context, but neither will if all this is going on inside a function, where y is an argument or local variable rather than a global. But get will:

df %>% filter(y == get("y"))

f <- function(df, y) {
  df %>% filter(y == get("y"))
}

So use get.
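
A note for newer dplyr versions: as far as I can tell, get("y") inside filter() is now evaluated in the data mask and finds the column first, so it may no longer pick out the outer y. Tidy evaluation offers explicit ways to say which y is meant. A minimal sketch, assuming dplyr 1.0 or later; the .data and .env pronouns and the !! operator are not part of the original answer, and keep_matching is a hypothetical helper name:

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# .data$y is the column; .env$y is the variable in the calling environment
res_env <- df %>% filter(.data$y == .env$y)

# !! evaluates y in the calling environment first and inlines its value
res_bang <- df %>% filter(y == !!y)

# The same idea inside a function whose argument is also called y
# (keep_matching is a hypothetical name used only for illustration)
keep_matching <- function(df, y) {
  df %>% filter(.data$y == .env$y)
}
res_fun <- keep_matching(df, 6)
```

Because .env refers to the environment where the expression was written, the same line should work at top level and inside a function without change.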

Or just use df[df$y == y, ] instead of dplyr; there df$y is unambiguously the column and the bare y is the variable.
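
The base R version also holds up inside a function, since df$y and the bare y are never confused. A small sketch (subset_by_y is a hypothetical helper name, not something from the question):

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# subset_by_y is a hypothetical name used only for illustration
subset_by_y <- function(df, y) {
  # df$y is always the column; the bare y is the function argument
  df[df$y == y, ]
}

res <- subset_by_y(df, 5)
```

Unlike filter(), this keeps the original row names, so the matching row is still labelled 2.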

Spacedman answered Nov 20 '22 18:11


The global environment can be accessed via the .GlobalEnv object:

> filter(df, y==.GlobalEnv$y)
  x y
1 2 5
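
The plain y == y comparison returns every row because names in the expression are looked up among the data frame's columns before the calling environment. A rough base R analogy using with() (this only illustrates the lookup order and is not how dplyr is implemented):

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# Inside with(), y is found among df's columns first, so this compares
# the column with itself and is TRUE for every row
masked <- with(df, y == y)

# Going through .GlobalEnv skips the columns and reaches the global y
explicit <- with(df, y == .GlobalEnv$y)
```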

Interestingly, using the accessor function globalenv() as a substitute for .GlobalEnv doesn't work in this scenario.

Hong Ooi answered Nov 20 '22 19:11