I'm trying to filter out a bunch of data using the filter function from the dplyr package. Everything appears to be going exactly as I would hope, but when I try to draw some charts from the newly filtered data, all of the levels that I filtered out still show up (albeit with no values), and their presence throws off my horizontal axis.
So two questions:
1) Why are these filtered levels still in the data?
2) How do I filter so that they are no longer present?
Here is a small example you can run to see what I am talking about:
library(dplyr)
library(ggvis)
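# small example frame
data <- data.frame(
  x = c(1:10),
  # stored explicitly as a factor: since R 4.0.0, data.frame() no longer
  # converts strings to factors by default
  y = factor(rep(c("yes", "no"), 5))
)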
# filtering to only include data with "yes" in y variable
new_data <- data %>%
  filter(y == "yes")
levels(new_data$y) ## Why is "no" showing up as a level for this if I've filtered that out?
# Illustration of the filtered values still showing up on axis
new_data %>%
  ggvis(~y, ~x) %>%
  layer_bars()
Thanks for any help.
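Factors in R do not automatically drop levels when filtered: the set of levels is stored as an attribute of the factor, and subsetting leaves that attribute untouched. You may think this is a silly default (I do), but it's easy to deal with: just use the droplevels function on the result.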
new_data <- data %>%
  filter(y == "yes") %>%
  droplevels()
levels(new_data$y)
## [1] "yes"
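To see that this comes from how factors work rather than from dplyr, here is a minimal base R sketch. The variable names (y, kept) are only illustrative, and it assumes nothing beyond a standard R installation.

```r
# A factor with two levels, built the same way as the y column above.
y <- factor(rep(c("yes", "no"), 5))

# Subsetting keeps the full set of levels, even those with no rows left.
kept <- y[y == "yes"]
levels(kept)   # "no" "yes"
table(kept)    # "no" is still listed, with a count of 0

# Either of these rebuilds the factor with only the levels in use.
levels(droplevels(kept))   # "yes"
levels(factor(kept))       # "yes"
```

The zero-count "no" entry in the table is the same empty category that ends up on the chart axis.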
If you did this all the time, you could define a new function:
dfilter <- function(...) droplevels(filter(...))
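Here is a rough usage sketch for such a wrapper. It assumes dplyr is installed, and it writes dplyr::filter out in full so the wrapper cannot pick up stats::filter when dplyr is not attached. The extra column g is hypothetical and is only there to show that droplevels, applied to a data frame, acts on every factor column.

```r
# Same wrapper idea, with the namespace spelled out (requires dplyr installed).
dfilter <- function(...) droplevels(dplyr::filter(...))

data <- data.frame(
  x = 1:10,
  y = factor(rep(c("yes", "no"), 5)),
  g = factor(rep(c("a", "b"), each = 5))   # hypothetical second factor
)

# Filtering on y drops the unused "no" level.
new_data <- dfilter(data, y == "yes")
levels(new_data$y)   # "yes"

# Filtering on x leaves both y levels in use, but "b" disappears from g.
first_half <- dfilter(data, x <= 5)
levels(first_half$y)   # "no" "yes"
levels(first_half$g)   # "a"
```

If some factor columns should keep their full level set, droplevels on a data frame accepts an except argument naming the columns to leave alone.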