
Select rows from a data frame based on values in a vector

Tags: r, r-faq, subset

I have data similar to this:

dt <- structure(list(
  fct = structure(c(1L, 2L, 3L, 4L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 4L),
                  .Label = c("a", "b", "c", "d"), class = "factor"),
  X = c(2L, 4L, 3L, 2L, 5L, 4L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 4L, 2L)),
  .Names = c("fct", "X"), class = "data.frame", row.names = c(NA, -15L))

I want to select rows from this data frame based on the values in the fct variable. For example, if I wish to select rows containing either "a" or "c" I can do this:

dt[dt$fct == 'a' | dt$fct == 'c', ] 

which yields

1    a 2
3    c 3
5    c 5
7    a 7
9    c 9
10   a 1
12   c 2
14   c 4

as expected. But my real data is more complex, and I want to select rows based on the values in a vector such as

vc <- c('a', 'c') 

So I tried

dt[dt$fct == vc, ] 

but of course that doesn't work. I know I could code something to loop through the vector and pull out the rows needed and append them to a new dataframe, but I was hoping there was a more elegant way.

So how can I filter/subset my data based on the contents of the vector vc?

Joe King asked Jul 23 '12 12:07


People also ask

How do you subset a DataFrame based on a vector in R?

If a data frame has a column whose values also appear in a vector, we can subset the data frame to the rows whose column value is in that vector. This can be done with single square brackets and the %in% operator.

How do I select a value from a vector in R?

The way you tell R that you want to select some particular elements (i.e., a 'subset') from a vector is by placing an 'index vector' in square brackets immediately following the name of the vector. For a simple example, try x[1:10] to view the first ten elements of x.
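As a small illustration of index vectors, here is a sketch using a made-up vector x (the values are an assumption, chosen only so the results are easy to read). It shows positional, negative, and logical index vectors in square brackets.

```r
# Hypothetical example vector with 20 elements: 10, 20, ..., 200
x <- seq(10, 200, by = 10)

first_ten <- x[1:10]     # positions 1 through 10
picked    <- x[c(2, 5)]  # only the 2nd and 5th elements
dropped   <- x[-1]       # everything except the 1st element
big       <- x[x > 150]  # logical index: keep elements greater than 150

print(first_ten)
print(picked)
print(dropped)
print(big)
```

The logical form is the one used for filtering data frame rows: the condition produces a TRUE/FALSE vector, and only the TRUE positions are kept.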


2 Answers

Have a look at ?"%in%".

dt[dt$fct %in% vc,]
   fct X
1    a 2
3    c 3
5    c 5
7    a 7
9    c 9
10   a 1
12   c 2
14   c 4
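For context on why dt$fct == vc from the question misses rows while %in% does not, here is a minimal sketch. It rebuilds the same data with factor() for brevity (an assumption about equivalence: same values and levels as the structure() call in the question). The names eq_rows and in_rows are only for illustration.

```r
# Same values as the question's data, rebuilt with factor() for brevity
dt <- data.frame(
  fct = factor(c("a", "b", "c", "d", "c", "d", "a", "b",
                 "c", "a", "b", "c", "b", "c", "d")),
  X   = c(2L, 4L, 3L, 2L, 5L, 4L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 4L, 2L)
)
vc <- c("a", "c")

# `==` compares element by element and recycles vc as a, c, a, c, ...
# so a row matches only if it lines up with the recycled value.
# (R typically warns here because 15 is not a multiple of 2.)
eq_rows <- which(suppressWarnings(dt$fct == vc))

# `%in%` asks, for each element of dt$fct, whether it occurs anywhere in vc
in_rows <- which(dt$fct %in% vc)

print(eq_rows)  # rows that happen to line up with the recycled vector
print(in_rows)  # every row whose fct is "a" or "c"
```

Since == pairs values up by position, it is a test of equality at each position, not of membership, which is what the question needs.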

You could also use ?is.element:

dt[is.element(dt$fct, vc),] 
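A quick sketch, on a small made-up factor, suggesting that %in% and is.element give the same logical vector, since both are thin wrappers around match(). The variable names below are hypothetical.

```r
# Small hypothetical factor and lookup vector
x  <- factor(c("a", "b", "c", "d", "c"))
vc <- c("a", "c")

via_in      <- x %in% vc              # membership test
via_element <- is.element(x, vc)      # same test, function-call form
via_match   <- !is.na(match(x, vc))   # what both are built on

print(via_in)
print(identical(via_in, via_element))
print(identical(via_in, via_match))

# Negating the test keeps the rows that are NOT in vc
not_in <- x[!(x %in% vc)]
print(not_in)
```

Which of the two to use is mostly a matter of taste; %in% reads more naturally inside a subsetting expression.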
johannes answered Sep 20 '22 14:09


Similar to above, using filter from dplyr:

library(dplyr)
filter(dt, fct %in% vc)
Andrew Haynes answered Sep 20 '22 14:09