
Subset a data frame using OR when the column contains a factor

I would like to subset a data frame in R based on one OR another value in a factor column, but it seems I cannot use | with factor values.

Example:

# fake data
x <- sample(1:100, 9)
nm <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
fake <- data.frame(nm = factor(nm), x = x)
# subset fake to only rows with name equal to a or b
fake.trunk <- fake[fake$nm == "a" | "b", ]

produces the error:

Error in fake$nm == "a" | "b" : 
operations are possible only for numeric, logical or complex types

How can I accomplish this?

Obviously my actual data frame has more than 3 values in the factor column so just using != "c" won't work.

DQdlM asked Apr 15 '11 18:04

2 Answers

Each side of | has to be a complete logical comparison, so you need fake.trunk <- fake[fake$nm == "a" | fake$nm == "b", ]. A more concise way of writing that (especially with more than two conditions) is to use %in%:

fake[ fake$nm %in% c("a","b"), ]
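
A minimal sketch of why the original expression fails and why the two forms above select the same rows. It assumes the fake data frame from the question, rebuilt here with data.frame() and stringsAsFactors = TRUE so that nm is a factor on any R version; set.seed() and the helper names failed, by_or and by_in are added for illustration only:

```r
set.seed(1)  # assumption: added only so the sample is reproducible
fake <- data.frame(
  nm = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  x  = sample(1:100, 9),
  stringsAsFactors = TRUE
)

# `==` binds tighter than `|`, so the original is parsed as
# (fake$nm == "a") | "b", and `|` cannot take the character string "b".
failed <- inherits(try(fake$nm == "a" | "b", silent = TRUE), "try-error")

# Each side of `|` must be a full logical comparison.
by_or <- fake[fake$nm == "a" | fake$nm == "b", ]

# %in% tests membership in a set of values.
by_in <- fake[fake$nm %in% c("a", "b"), ]

# Both approaches keep the same six rows.
identical(by_or, by_in)
```

With more values to keep, only the vector on the right of %in% grows, whereas the | form needs one more comparison per value.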
Joshua Ulrich answered Oct 24 '22 13:10


Another approach would be to use subset() and write

fake.trunk = subset(fake, nm %in% c('a', 'b'))
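
subset() evaluates nm inside the data frame, so the fake$ prefix is not needed. A small sketch comparing it with the bracket form, assuming a stand-in for the question's data with fixed x values. The droplevels() step is an addition that appears in neither answer; it is shown because the unused level "c" otherwise stays attached to the factor after subsetting:

```r
fake <- data.frame(
  nm = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  x  = 1:9,  # assumption: fixed values instead of sample()
  stringsAsFactors = TRUE
)

by_subset  <- subset(fake, nm %in% c("a", "b"))
by_bracket <- fake[fake$nm %in% c("a", "b"), ]

# Check that the two selections agree.
same <- isTRUE(all.equal(by_subset, by_bracket))

# The factor still remembers the level that no longer occurs.
levels_before <- levels(by_subset$nm)

# droplevels() removes unused levels from every factor column.
trimmed <- droplevels(by_subset)
levels_after <- levels(trimmed$nm)
```

Whether to drop the unused level depends on what happens next: tables and plots built from the subset will otherwise still show an empty "c" group.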
Ramnath answered Oct 24 '22 13:10