
Select subset of dataframe by non-unique ids

Tags: r, selection

Suppose I have a dataframe like this one:

df <- data.frame(id = c("a", "b", "a", "c", "e", "d", "e"), n = 1:7)

and a vector with ids like this one:

v <- c("a", "b")

How can I select the rows of the dataframe whose id matches one of the ids in v? I can't use the id column as row names because its values are not unique. When I try that, I get:

 rownames(df) <- df[["id"]]
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1L, 2L, 1L, 3L, 5L,  : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘a’, ‘e’ 
amarillion asked Apr 02 '10 19:04


2 Answers

Use %in% to build a logical index over the rows. It does not require the ids to be unique:

df[df$id %in% v,]
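
As a minimal sketch using the df and v from the question, the intermediate logical vector can be inspected before it is used as a row index. The names keep and result are introduced here only for illustration:

```r
df <- data.frame(id = c("a", "b", "a", "c", "e", "d", "e"), n = 1:7)
v <- c("a", "b")

# %in% yields one TRUE/FALSE per row of df, so every row whose id
# appears in v is kept, including the repeated "a"
keep <- df$id %in% v
print(keep)
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE

result <- df[keep, ]
print(result)
#   id n
# 1  a 1
# 2  b 2
# 3  a 3
```

The trailing comma in df[keep, ] matters: it selects rows and keeps all columns.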
Shane answered Oct 11 '22 22:10


This should do what you want. which() turns the logical vector from %in% into the integer positions of the matching rows:

ndx <- which(df$id %in% v)
df[ndx, ]
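
For comparison, here is a small sketch showing that the positional index from which() selects the same rows as the logical index. It also shows one case where the two approaches differ: selecting the complement when nothing matches. The names by_position, by_logical and none are hypothetical and used only for this illustration:

```r
df <- data.frame(id = c("a", "b", "a", "c", "e", "d", "e"), n = 1:7)
v <- c("a", "b")

ndx <- which(df$id %in% v)          # integer positions of matching rows
print(ndx)
# [1] 1 2 3

by_position <- df[ndx, ]            # index by position
by_logical  <- df[df$id %in% v, ]   # index by logical vector
print(all.equal(by_position, by_logical))
# [1] TRUE

# Caveat when negating: if no id matches, which() returns integer(0),
# and a negative empty index selects no rows instead of all rows
none <- c("z")
print(nrow(df[-which(df$id %in% none), ]))
# [1] 0
print(nrow(df[!(df$id %in% none), ]))
# [1] 7
```

For plain selection both forms give the same rows. When excluding ids, negating the logical vector with ! is the safer choice.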
doug answered Oct 12 '22 00:10