I have a data frame df with an ID column, e.g. A, B, etc. I also have a vector containing certain IDs:
L <- c("A", "B", "E")
How can I filter the data frame to get only the IDs present in the vector? Individually, I would use
subset(df, ID == "A")
but how do I filter on a whole vector?
How do you apply a filter to a data frame in R? The filter() function from the dplyr package keeps the rows of a data frame for which the given condition(s) evaluate to TRUE, which is a convenient way to reduce a large dataset to a smaller subset.
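For the question at hand, a minimal sketch with filter() might look like the following. It assumes the dplyr package is installed and that the column is called ID as in the question; the x column is made up purely for illustration.

```r
library(dplyr)

# Illustrative data: one row per ID, plus a made-up value column x
df <- data.frame(ID = LETTERS, x = 1:26)
L <- c("A", "B", "E")

# Keep only the rows whose ID appears in the vector L
res <- filter(df, ID %in% L)
res
```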
You can use the %in% operator:
> df <- data.frame(id=c(LETTERS, LETTERS), x=1:52)
> L <- c("A","B","E")
> subset(df, id %in% L)
   id  x
1   A  1
2   B  2
5   E  5
27  A 27
28  B 28
31  E 31
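The same test also works with plain bracket indexing, and negating it keeps the rows whose ID is not in the vector. A minimal sketch, reusing the example data from above (the names keep and drop are only illustrative):

```r
# Same example data as in the subset() call above
df <- data.frame(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# Logical indexing: id %in% L is TRUE for every row whose id appears in L
keep <- df[df$id %in% L, ]

# Negate the test to keep every row whose id is NOT in L
drop <- df[!(df$id %in% L), ]

nrow(keep)
nrow(drop)
```

Unlike subset(), bracket indexing does not rely on non-standard evaluation, which some people prefer inside functions.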
If your IDs are unique, you can use match():
> df <- data.frame(id=c(LETTERS), x=1:26)
> df[match(L, df$id), ]
  id x
1  A 1
2  B 2
5  E 5
or make them the rownames of your dataframe and extract by row:
> rownames(df) <- df$id
> df[L, ]
  id x
A  A 1
B  B 2
E  E 5
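One caveat with the match() approach is worth a quick sketch. match() returns only the position of the first match, and NA where there is none, so an ID that is missing from the data produces a row of NAs, and the result follows the order of the lookup vector rather than the order of df. The ID "ZZ" and the names L2, idx, res and res_clean below are made up for illustration.

```r
df <- data.frame(id = LETTERS, x = 1:26)

# "ZZ" is a made-up ID that does not occur in df; note the order E, then A
L2 <- c("E", "ZZ", "A")

# Positions of L2's elements in df$id, NA where an element is absent
idx <- match(L2, df$id)
idx

# Rows come back in the order of L2; the NA index yields an all-NA row
res <- df[idx, ]
res

# Drop the unmatched entries if the NA row is not wanted
res_clean <- df[idx[!is.na(idx)], ]
res_clean
```

If the IDs are not unique or may be missing, %in% is usually the safer choice, since it keeps every matching row and never introduces NA rows.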
Finally, for more advanced users, and if speed is a concern, I'd recommend looking into the data.table package.
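As a rough sketch of what that could look like, assuming the data.table package is installed (the names dt, res_in and res_join are illustrative): the %in% filter carries over directly, and a join on the id column is an alternative way to do the same lookup.

```r
library(data.table)

df <- data.frame(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# Convert the data frame to a data.table
dt <- as.data.table(df)

# Same %in% test, evaluated inside the data.table
res_in <- dt[id %in% L]

# Join-style lookup on id; nomatch = NULL drops IDs in L that are absent from dt
res_join <- dt[.(id = L), on = "id", nomatch = NULL]

res_in
res_join
```

The %in% form keeps the rows in their original order, while the join form returns them grouped in the order of L.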