Filtering a data frame on a vector [duplicate]

I have a data frame df with an ID column containing values such as A, B, etc. I also have a vector of the IDs I want to keep:

L <- c("A", "B", "E") 

How can I filter the data frame to keep only the rows whose ID is present in the vector? For a single ID, I would use

subset(df, ID == "A") 

but how do I filter on a whole vector?

adam.888 asked Feb 19 '12 14:02



1 Answer

You can use the %in% operator:

> df <- data.frame(id=c(LETTERS, LETTERS), x=1:52)
> L <- c("A","B","E")
> subset(df, id %in% L)
   id  x
1   A  1
2   B  2
5   E  5
27  A 27
28  B 28
31  E 31
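
As a rough sketch of what `%in%` is doing here: it returns one TRUE/FALSE per row, indicating whether that row's id is a member of L, whereas `==` compares element by element and recycles the shorter vector. The same filter can therefore be written with plain bracket indexing, and negated to get the complement. The data frame mirrors the one above; the names `keep`, `res` and `rest` are only illustrative.

```r
df <- data.frame(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# One logical value per row: is this id a member of L?
keep <- df$id %in% L

# Bracket indexing with the logical vector; same rows as subset(df, id %in% L)
res <- df[keep, ]

# Negate the test to get every row whose id is NOT in L
rest <- df[!keep, ]
```

The bracket form is often preferred inside functions, since subset() evaluates its condition non-standardly.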

If your IDs are unique, you can use match():

> df <- data.frame(id=c(LETTERS), x=1:26)
> df[match(L, df$id), ]
  id x
1  A 1
2  B 2
5  E 5
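
Two properties of match() are worth keeping in mind with this approach: it returns only the first position of each ID (hence the uniqueness requirement), and it returns NA for any ID that is not in the data frame, which would produce a row of NAs when used as an index. The result also follows the order of the lookup vector rather than the order of the data frame. A minimal sketch, using a hypothetical vector `L2` that is reordered and contains one ID absent from df:

```r
df <- data.frame(id = LETTERS, x = 1:26)
L2 <- c("E", "A", "ZZ")  # hypothetical: reordered, and "ZZ" is not in df

# Position of each element of L2 in df$id; NA where there is no match
idx <- match(L2, df$id)

# Drop the unmatched positions before indexing to avoid all-NA rows
res <- df[idx[!is.na(idx)], ]
```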

or make them the row names of your data frame and extract by row:

> rownames(df) <- df$id
> df[L, ]
  id x
A  A 1
B  B 2
E  E 5

Finally, for more advanced users, and if speed is a concern, I'd recommend looking into the data.table package.
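
For reference, a minimal sketch of how the same filter might look with data.table, assuming the package is installed. The table mirrors the first example above; `res` and `res_keyed` are illustrative names. The first form is a direct translation of the `%in%` filter, and the second sets a key on id so that the lookup is done as a join on the sorted column.

```r
library(data.table)

dt <- data.table(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# Same membership filter as subset(df, id %in% L); no trailing comma needed
res <- dt[id %in% L]

# Keyed lookup: setkey() sorts dt by id in place, then dt[.(L)] joins on the key
setkey(dt, id)
res_keyed <- dt[.(L)]
```

Note that the keyed form returns rows grouped by ID in the order of L, not in the table's original row order.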

flodel answered Oct 02 '22 11:10