Subset a dataframe by multiple factor levels [duplicate]

Question

How can I avoid using a loop to subset a dataframe based on multiple factor levels?

In the following example my desired output is a dataframe. The dataframe should contain the rows of the original dataframe where the value in "Code" equals one of the values in "selected".

Working example:

#sample data
Code<-c("A","B","C","D","C","D","A","A")
Value<-c(1, 2, 3, 4, 1, 2, 3, 4)
data<-data.frame(Code, Value) #no cbind(), which would coerce Value to character

selected<-c("A","B") #want rows where Code is A or B

#Begin subsetting
result<-data[which(data$Code==selected[1]),]
s1<-2
while(s1<length(selected)+1)
{
  result<-rbind(result,data[which(data$Code==selected[s1]),])
  s1<-s1+1
}

This is a toy example of a much larger dataset, so "selected" may contain a great number of elements and the data a great number of rows. Therefore I would like to avoid the loop.

Metrics · Accepted Answer

You can use %in%, which tests each value of Code for membership in selected:

  data[data$Code %in% selected,]
  Code Value
1    A     1
2    B     2
7    A     3
8    A     4
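
As a minimal sketch of how this replaces the loop from the question, the following compares both approaches on the sample data. The names keep, result_in, result_loop and same_rows are illustrative and not from the original post, and Code is explicitly made a factor here so that the droplevels() step has something to do.

```r
# Sample data from the question, built without cbind() so Value stays numeric
Code  <- c("A", "B", "C", "D", "C", "D", "A", "A")
Value <- c(1, 2, 3, 4, 1, 2, 3, 4)
data  <- data.frame(Code = factor(Code), Value = Value)
selected <- c("A", "B")

# Vectorised filter: one logical value per row
keep <- data$Code %in% selected
result_in <- data[keep, ]

# The loop-based approach from the question, for comparison
result_loop <- data[which(data$Code == selected[1]), ]
for (s in selected[-1]) {
  result_loop <- rbind(result_loop, data[which(data$Code == s), ])
}

# Same rows; only the order differs (the loop groups rows by selected value)
same_rows <- setequal(rownames(result_in), rownames(result_loop))

# Optional: drop factor levels that no longer occur in the subset (C and D)
result_in <- droplevels(result_in)
```

Note that %in% keeps the rows in their original order, whereas the loop returns them grouped by the order of selected.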

Joe · Answer

Here's another:

data[data$Code == "A" | data$Code == "B", ]

It's also worth mentioning that the subsetting vector doesn't have to be a column of the data frame, as long as it matches the data frame's rows in length and order. Here the standalone vector Code, from which the data frame was built, is still in the workspace. So,

data[Code == "A" | Code == "B", ]

also works, which is one of the really useful things about R.
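
One caveat when generalising this to many values: comparing against the whole vector with == is not a membership test, because R recycles the shorter vector. The sketch below shows the pitfall on the sample data and one way to build the chained "or" programmatically; recycled, masks, any_match and result_or are illustrative names, and Code is made a factor as an assumption to match the question title.

```r
Code  <- c("A", "B", "C", "D", "C", "D", "A", "A")
Value <- c(1, 2, 3, 4, 1, 2, 3, 4)
data  <- data.frame(Code = factor(Code), Value = Value)
selected <- c("A", "B")

# Pitfall: == recycles `selected` along Code and compares element by element,
# so only rows 1, 2 and 7 are flagged and row 8 is missed
recycled <- data$Code == selected

# One comparison per selected value, combined with |
masks <- lapply(selected, function(s) data$Code == s)
any_match <- Reduce(`|`, masks)

result_or <- data[any_match, ]
```

The mask built with Reduce() is the same as data$Code %in% selected, so for a long list of values %in% is the shorter way to write it.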

Jilber Urbina · Answer

Try this:

> data[match(as.character(data$Code), selected, nomatch = 0) > 0, ]
  Code Value
1    A     1
2    B     2
7    A     3
8    A     4
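
For context on how match() behaves here: it returns, for each row, the position of its Code within selected, not a row number of data, so its result is best turned into a logical mask before it is used as an index. A minimal sketch on the sample data follows; pos, keep, keep0 and result_match are illustrative names, and Code is made a factor as an assumption.

```r
Code  <- c("A", "B", "C", "D", "C", "D", "A", "A")
Value <- c(1, 2, 3, 4, 1, 2, 3, 4)
data  <- data.frame(Code = factor(Code), Value = Value)
selected <- c("A", "B")

# match() gives, for each row, the position of its Code in `selected`
# (NA where the Code is not in `selected`)
pos <- match(as.character(data$Code), selected)

# Turn positions into a logical row mask before indexing
keep <- !is.na(pos)
result_match <- data[keep, ]

# Equivalent mask via nomatch = 0, the form base R documents for %in%
keep0 <- match(as.character(data$Code), selected, nomatch = 0L) > 0L
```

Using the raw positions directly as a row index would instead pick rows 1 and 2 of data repeatedly, which is why the mask step matters.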

Tags: r, subset

Asked by Walter
