How to reorder factor levels in a tidy way?

Hi, I usually use code like the following to reorder bars in ggplot or other types of plots.

Normal plot (unordered)

library(tidyverse)
iris.tr <-iris %>% group_by(Species) %>% mutate(mSW = mean(Sepal.Width)) %>%
  select(mSW,Species) %>% 
  distinct()
ggplot(iris.tr,aes(x = Species,y = mSW, color = Species)) +
  geom_point(stat = "identity")

Ordering the factor + ordered plot

iris.tr$Species <- factor(iris.tr$Species,
                          levels = iris.tr[order(iris.tr$mSW),]$Species,
                          ordered = TRUE)
ggplot(iris.tr,aes(x = Species,y = mSW, color = Species)) + 
  geom_point(stat = "identity")

I find the factor line extremely unpleasant, and I wonder why arrange() or some other function can't simplify this. Am I missing something?

Note:

This does not work, but I would like to know whether something like this exists in the tidyverse.

iris.tr <-iris %>% group_by(Species) %>% mutate(mSW = mean(Sepal.Width)) %>%
  select(mSW,Species) %>% 
  distinct() %>% 
  arrange(mSW)
ggplot(iris.tr,aes(x = Species,y = mSW, color = Species)) + 
  geom_point(stat = "identity")
David Mas asked Jul 17 '17 16:07


2 Answers

Using forcats:

iris.tr %>%
    ungroup() %>%  # iris.tr is still grouped by Species, and mutate() cannot modify a grouping column
    mutate(Species = fct_reorder(Species, mSW)) %>%
    ggplot() +
    aes(Species, mSW, color = Species) +
    geom_point()
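
To see what fct_reorder() does to the levels without drawing a plot, here is a minimal sketch. The names iris_means, asc and desc are made up for illustration, and it assumes dplyr and forcats are installed. It uses summarise() rather than the question's mutate()/distinct() pipeline so that the result is ungrouped.

```r
library(dplyr)
library(forcats)

# Per-species mean sepal width; summarise() drops the single grouping level,
# so Species can be modified afterwards.
iris_means <- iris %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width))

# Levels ordered by increasing mSW (the default)
asc <- iris_means %>% mutate(Species = fct_reorder(Species, mSW))

# Levels ordered by decreasing mSW
desc <- iris_means %>% mutate(Species = fct_reorder(Species, mSW, .desc = TRUE))

levels(iris_means$Species)  # original order: setosa, versicolor, virginica
levels(asc$Species)         # versicolor, virginica, setosa
levels(desc$Species)        # setosa, virginica, versicolor
```

Only the levels change; the rows stay where they were, which is all ggplot needs to order the axis and legend.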
Konrad Rudolph answered Oct 05 '22 03:10


Reordering the factor using base:

iris.ba = iris
iris.ba$Species = with(iris.ba, reorder(Species, Sepal.Width, mean))

Translating to dplyr:

iris.tr = iris %>% mutate(Species = reorder(Species, Sepal.Width, mean))

After that, you can continue on to summarize and plot as in your question.
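
Put together, the whole thing might look like the sketch below. It swaps the question's mutate()/select()/distinct() steps for summarise(), which is an assumption about what "summarize" means here, and the names iris_tr and p are illustrative.

```r
library(dplyr)
library(ggplot2)

iris_tr <- iris %>%
  # Reorder the Species levels by mean Sepal.Width before summarising
  mutate(Species = reorder(Species, Sepal.Width, mean)) %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width))

# The axis and legend follow the factor levels, so no extra ordering is needed
p <- ggplot(iris_tr, aes(x = Species, y = mSW, color = Species)) +
  geom_point()

levels(iris_tr$Species)  # versicolor, virginica, setosa
```

Printing p draws the plot with the species ordered by increasing mean sepal width.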


A couple of comments: reordering a factor modifies a data column, and the dplyr command for modifying a data column is mutate. All arrange does is reorder rows; this has no effect on the levels of the factor and hence no effect on the order of a legend or axis in ggplot.
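
A small sketch of that difference, using a summarised copy of iris (the names iris_means, arranged and releveled are made up for illustration):

```r
library(dplyr)

iris_means <- iris %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width))

# arrange() moves rows but leaves the factor levels alone
arranged <- iris_means %>% arrange(mSW)
as.character(arranged$Species)  # rows: versicolor, virginica, setosa
levels(arranged$Species)        # levels: setosa, versicolor, virginica

# mutate() + reorder() changes the levels, which is what ggplot looks at
releveled <- iris_means %>% mutate(Species = reorder(Species, mSW))
levels(releveled$Species)       # levels: versicolor, virginica, setosa
```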

All factors have an order for their levels. The difference between an ordered = TRUE factor and a regular factor is how the contrasts are set up in a model. ordered = TRUE should only be used if your factor levels have a meaningful rank order, like "Low", "Medium", "High", and even then it only matters if you are building a model and don't want the default contrasts comparing everything to a reference level.
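
For illustration, here is a base R sketch comparing the two kinds of factor. It assumes the default options("contrasts") setting, and the names sizes, plain and ranked are made up. It also shows one further difference not mentioned above: ordered factors support comparison operators.

```r
sizes <- c("Low", "Medium", "High")
plain <- factor(c("Medium", "Low", "High"), levels = sizes)
ranked <- factor(c("Medium", "Low", "High"), levels = sizes, ordered = TRUE)

# Both factors carry the same level order
identical(levels(plain), levels(ranked))  # TRUE

# What differs is the default contrasts used in a model
colnames(contrasts(plain))   # "Medium" "High"  (treatment contrasts vs. "Low")
colnames(contrasts(ranked))  # ".L" ".Q"        (polynomial contrasts)

# Ordered factors also allow comparisons between levels
ranked < "High"  # TRUE TRUE FALSE
```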

Gregor Thomas answered Oct 05 '22 01:10