Given the following mock data:
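set.seed(123)
# stringsAsFactors = TRUE makes `let` a factor; this is not the default in R >= 4.0
x <- data.frame(let = sample(letters[1:5], 100, replace = TRUE),
                num = sample(1:10, 100, replace = TRUE),
                stringsAsFactors = TRUE)
y <- subset(x, let != 'a')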
Creating a table of y$let yields
a b c d e
0 20 21 22 18
But I don't want a to show anymore. If I try to do this:
levels(y$let) <- factor(y$let)
I mess up the frequencies, since now table(y$let) gives me
b d c e
0 20 21 40
I'm aware I could do xtabs(~ y$let, drop.unused.levels = TRUE) and work around the problem, but that doesn't reset the variable's levels at its core (which is important to me, since this is an early change I'm making to the dataset that will carry through the whole analysis). Moreover, the result of xtabs has a different class from table, which will give me headaches later in the project.
The question is: how can I automatically change levels(y$let) so it doesn't show levels that were dropped when I created the subset? In this case, how can I make it show [1] "b" "c" "d" "e"?
Removing Levels from a Factor in R Programming – droplevels() Function. The droplevels() function in R is used to remove unused levels from a factor. Its factor method has the signature droplevels(x, exclude = if(anyNA(levels(x))) NULL else NA, …)
How do I Rename Factor Levels in R? The simplest way to rename multiple factor levels is to use the levels() function. For example, to recode the factor levels “A”, “B”, and “C” you can use the following code: levels(your_df$Category1) <- c("Factor 1", "Factor 2", "Factor 3") .
Subset a Data Frame with Base R Extract[] To specify a logical expression for the rows parameter, use the standard R operators. If subsetting is done by only rows or only columns, then leave the other value blank. For example, to subset the d data frame only by rows, the general form reduces to d[rows,] .
Since R 2.12.0 there is a function for this:
y <- droplevels(y)
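To illustrate the answer above, here is a minimal sketch that applies droplevels() to the data from the question. It assumes let is a factor column (built explicitly with factor() here, since data.frame() no longer converts strings to factors by default in R >= 4.0); the exact counts depend on the R version's sampling, so none are shown.

```r
# Sketch: drop unused factor levels from a whole data frame after subsetting.
set.seed(123)
x <- data.frame(let = factor(sample(letters[1:5], 100, replace = TRUE)),
                num = sample(1:10, 100, replace = TRUE))

y <- subset(x, let != "a")
levels(y$let)        # "a" is still listed, although no rows contain it

y <- droplevels(y)   # drops unused levels from every factor column in y
levels(y$let)        # "a" is no longer a level

tab <- table(y$let)  # the result is still an ordinary "table" object
class(tab)
```

Called on a data frame, droplevels() processes all factor columns at once; called on a single factor (droplevels(y$let)) it returns just that factor with the unused levels removed.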
Just do y$let <- factor(y$let). Running factor on an existing factor variable will reset the levels to only those that are present.
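As a small self-contained check of that behaviour, the sketch below uses a hypothetical four-element factor (f and g are illustrative names, not from the question) so the result is easy to verify by eye.

```r
# Sketch: re-running factor() keeps only the levels that actually occur.
f <- factor(c("a", "b", "b", "c"))

g <- f[f != "a"]   # subsetting removes the value but keeps the level
levels(g)          # "a" "b" "c"

g <- factor(g)     # rebuild the factor from the values that remain
levels(g)          # "b" "c"
table(g)           # b: 2, c: 1
```

This works on one column at a time, which is convenient when only a single variable needs resetting.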
Adding to Hong Ooi's answer, here is an example I found on R-Bloggers.
# Create some fake data
x <- as.factor(sample(head(colors()),100,replace=TRUE))
levels(x)
x <- x[x!="aliceblue"]
levels(x) # still the same levels
table(x) # even though one level has 0 entries!
The solution is simple: run factor() again:
x <- factor(x)
levels(x)