Ghost factor levels in R [duplicate]

Tags: r, r-factor, subset

Possible Duplicate:
dropping factor levels in a subsetted data frame in R

I have subsetted away observations with a certain factor level. When I checked the result with summary(), the level was still listed, but with zero observations. Shouldn't it disappear during the subsetting?

ego_ asked Sep 20 '12 13:09


2 Answers

To make the extra levels disappear, use drop=TRUE when subsetting:

newfactor <- oldfactor[indices, drop=TRUE]
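To see the effect, here is a minimal sketch. The data and the contents of oldfactor and indices are made up for illustration; only the drop argument comes from the answer itself.

```r
# Hypothetical example data (not from the question).
oldfactor <- factor(c("a", "b", "c", "a", "b"))
indices <- oldfactor != "c"   # logical index that excludes level "c"

kept <- oldfactor[indices]                     # default: unused level "c" is retained
newfactor <- oldfactor[indices, drop = TRUE]   # unused level "c" is dropped

levels(kept)       # "a" "b" "c"
levels(newfactor)  # "a" "b"
table(kept)        # "c" is still listed, with a count of 0
```

The zero count for "c" in table(kept) is the same "ghost" level that summary() reports in the question.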

Incidentally, one reason this is not the default is that factors with different level sets cannot be compared: comparing them with == raises an error. So if you want to compare your factor with the original vector, or perhaps with a different subset of the vector, you need to keep the extra levels.
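The comparison point can be illustrated with a small sketch. The variable names and data are assumptions chosen for the example; tryCatch is used only so the script keeps running after the error.

```r
# Hypothetical example data (not from the question).
full <- factor(c("a", "b", "c"))

sub_keep <- full[1:2]               # levels: a, b, c (unused "c" retained)
sub_drop <- full[1:2, drop = TRUE]  # levels: a, b

# Same level set on both sides: element-wise comparison works.
same <- full[1:2] == sub_keep       # TRUE TRUE

# Different level sets: == stops with an error, captured here as a string.
result <- tryCatch(
  full[1:2] == sub_drop,
  error = function(e) conditionMessage(e)
)
result
```

If the comparison is still needed after dropping levels, one option is to compare as.character() of both sides instead of the factors themselves.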

David Robinson answered Nov 07 '22 21:11


Subsetting doesn't drop empty levels, and this is a feature rather than a bug. Think of the factor levels as defining the possible categories of a thing. If you take only a subset of these things, the set of possible categories doesn't change; your subset just doesn't contain any members of some of them.

If you want to drop these empty levels, see ?droplevels.
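As a rough sketch of how droplevels can be applied to a subsetted data frame (the data frame and its column names are invented for this example):

```r
# Hypothetical example data (not from the question).
df <- data.frame(
  group = factor(c("x", "y", "z", "x")),
  value = 1:4
)

sub <- df[df$group != "z", ]   # rows with level "z" removed
before <- levels(sub$group)    # "x" "y" "z": the empty level is still there

sub <- droplevels(sub)         # drops unused levels from every factor column
after <- levels(sub$group)     # "x" "y"
```

droplevels also accepts a single factor, so droplevels(sub$group) would work when only one column needs cleaning.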

Gavin Simpson answered Nov 07 '22 19:11