df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
I can filter/subset df with the values in vals with:
dplyr::filter(df, a %in% vals)
subset(df, a %in% vals)
Both give:
a b
2 B 0.4481627
4 D 0.2916513
What if I have a variable name in a vector, e.g.:
> names(df)[1]
[1] "a"
Then it doesn't work, I guess because the name is quoted:
dplyr::filter(df, names(df)[1] %in% vals)
[1] a b
<0 rows> (or 0-length row.names)
How do you do this?
UPDATE (what if it's dplyr::tbl_df(df)?)
The answers below work fine for data.frames, but not for data wrapped with dplyr::tbl_df:
df<-dplyr::tbl_df(df)
dplyr::filter(df, df[,names(df)[1]] %in% vals)
This does not work (I thought tbl_df was a simple wrapper on top of a data.frame?).
This does work again:
dplyr::filter(df, as.data.frame(df)[,names(df)[1]] %in% vals)
FINAL UPDATE: It works with tbl_df() using lazyeval::interp
See AndreyAkinshin's solution below.
The way you tell R that you want to select some particular elements (i.e., a 'subset') from a vector is by placing an 'index vector' in square brackets immediately following the name of the vector. For a simple example, try x[1:10] to view the first ten elements of x.
subset has a select argument. subset recycles its condition argument. filter supports conditions as separate arguments. filter preserves the class of the column.
Overview. The filter() function in dplyr subsets a data frame based on a provided condition. A row is kept only if the condition evaluates to TRUE for it. Rows for which the condition is FALSE or NA are dropped.
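To make the NA handling concrete, here is a minimal sketch. The data frame d and its column x are made up for illustration and assume nothing about the data in the question. It compares dplyr::filter and subset with plain bracket indexing on a column that contains a missing value.

```r
# Hypothetical toy data with one missing value
d <- data.frame(x = c(1, NA, 3))

# filter() and subset() keep only rows where the condition is TRUE,
# so the NA row is dropped along with the FALSE row
kept_filter <- dplyr::filter(d, x > 1)
kept_subset <- subset(d, x > 1)

# Bracket indexing with a logical vector that contains NA
# returns an all-NA row for the NA position instead of dropping it
kept_bracket <- d[d$x > 1, , drop = FALSE]

kept_filter
kept_subset
kept_bracket
```

This difference is one reason to prefer filter() or subset() over raw bracket indexing when the filtered column may contain missing values.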
You can use df[,"a"] or df[,1]:
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
dplyr::filter(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
dplyr::filter(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
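As a sketch of why the attempt in the question returns zero rows while the indexing above works: names(df)[1] is just the string "a", so the condition compares that one string with vals instead of comparing the column's values. The names cond_name, cond_col and picked below are only illustrative.

```r
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

# The string "a" is not in vals, so this is a single FALSE
# that gets recycled over every row
cond_name <- names(df)[1] %in% vals
cond_name  # FALSE

# Looking the column up by its name gives one logical value per row
cond_col <- df[[names(df)[1]]] %in% vals
cond_col   # FALSE TRUE FALSE TRUE

# The per-row condition selects rows 2 and 4
picked <- df[cond_col, ]
picked
```

df[[name]] always returns the column as a plain vector, which is what %in% needs on its left-hand side.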
Working with dplyr::tbl_df(df)
Some magic with lazyeval::interp helps us!
df <- dplyr::tbl_df(df)
expr <- lazyeval::interp(quote(x %in% y), x = as.name(names(df)[1]), y = vals)
dplyr::filter_(df, expr)
# Source: local data frame [2 x 2]
#
# a b
# 1 B 0.4481627
# 2 D 0.2916513
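In more recent dplyr releases, filter_() and tbl_df() are deprecated, so the following is a hedged alternative rather than the approach described above. It assumes a dplyr version that provides the .data pronoun (0.7.0 or later) and uses tibble::as_tibble() in place of tbl_df(). The variable col is only an illustrative name for the stored column name.

```r
df <- tibble::as_tibble(data.frame(a = LETTERS[1:4], b = rnorm(4)))
vals <- c("B", "D")
col <- names(df)[1]  # column name held as a string

# [[ ]] extracts the column as a plain vector, even from a tibble,
# whereas df[, col] on a tibble returns a one-column tibble
res_base <- dplyr::filter(df, df[[col]] %in% vals)

# .data[[col]] looks the column up by name inside filter()
res_pronoun <- dplyr::filter(df, .data[[col]] %in% vals)

res_base
res_pronoun
```

Both calls should keep the rows where a is "B" or "D". The .data form has the advantage of also working inside a pipeline, where the data frame has no name to index.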