I want to create a heat map using ggplot, but I want to order the y-axis by the number of observations. I order the data frame by the column N and add the number of observations to the group name so that it appears in the axis label. When I plot the data, it re-orders the axis based on the group name. Is there a way to set factor levels based on the order in which they appear in the data frame?
Some data:
library(dplyr)
library(tidyr)
library(ggplot2)
school <- c("School A", "School B", "School C", "School D", "School E", "School F")
N <- c(25,28,12,22,30,25)
var1 <- c(1,0,1,1,0,1)
var2 <- c(0,0,0,1,0,1)
var3 <- c(0,1,0,1,1,1)
df <- as_tibble(data.frame(school, N, var1, var2, var3))
df <- arrange(df, N) %>%
  gather(variable, value, var1:var3)
df$school <- paste0(df$school, " (", df$N, ")")
df <- select(df, school, variable, value)
ggplot(df, aes(variable, school)) + geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient(low = "white", high = "steelblue")
Ultimately I want the order of schools to be:
School C (12)
School D (22)
School A (25)
School F (25)
School B (28)
School E (30)
As I want to do this for multiple plots I want to find a way to do this automatically and not have to re-set factor levels each time.
One way to change the level order is to use factor() on the factor and specify the order directly. In this example, the function ordered() could be used instead of factor(). Another way to change the order is to use relevel() to make a particular level first in the list.
To sort a numerical factor column in an R data frame, convert the column with as.character() and then as.numeric(), and then apply order().
How do I rename factor levels in R? The simplest way to rename multiple factor levels is to use the levels() function. For example, to recode the factor levels “A”, “B”, and “C” you can use the following code: levels(your_df$Category1) <- c("Factor 1", "Factor 2", "Factor 3").
R – Level Ordering of Factors: Factors represent categorical columns, which have a limited number of unique values. Factors in R can be created using the factor() function, which takes a vector as input. The c() function is used to create a vector with explicitly provided values.
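The base R operations mentioned above can be tried on a small vector. This is only a sketch: the vectors size and num are made-up illustration data, not part of the question.

```r
# Hypothetical example data (not from the question)
size <- factor(c("low", "high", "medium", "low"))
levels(size)  # default level order is alphabetical

# Specify the level order directly with factor()
size2 <- factor(size, levels = c("low", "medium", "high"))

# Move a single level to the front with relevel()
size3 <- relevel(size, ref = "medium")

# Rename levels; new names are matched by position in the current level order
levels(size2) <- c("L", "M", "H")

# Sort a factor that stores numbers: go through character first,
# otherwise as.numeric() returns the internal level codes
num <- factor(c("10", "2", "33"))
sorted <- sort(as.numeric(as.character(num)))
```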
One way around this is to change your ggplot call to
ggplot(df, aes(variable, factor(school, levels = unique(school)))) + ...
To avoid typing this every time, you can create a function
f <- function(x) factor(x, levels = unique(x))
and then call it by ggplot(df, aes(variable, f(school))) + ...
Note that this will place the first level of the factor at the bottom of the plot. If you want it at the top, you need to change f to function(x) factor(x, levels = rev(unique(x)))
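To see what the helper does without drawing a plot, you can compare the levels it produces with the default ones. This is a minimal sketch; the vector x is an assumption that mimics the school column after arrange() and gather(), with each label repeated once per variable.

```r
f <- function(x) factor(x, levels = unique(x))

# School labels in the order they appear after sorting by N
school <- c("School C (12)", "School D (22)", "School A (25)",
            "School F (25)", "School B (28)", "School E (30)")

# gather() stacks the rows once per variable, so each label repeats
x <- rep(school, times = 3)

levels(factor(x))  # default: alphabetical, which is what ggplot falls back to
levels(f(x))       # order of first appearance in the data
```

Because unique() keeps the first occurrence of each value, the repeated rows do not disturb the order.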
Add the following forcats pipe to the code just before the call to ggplot().
library(forcats)
df$school <- fct_inorder(df$school) %>% fct_rev()
fct_inorder() creates factor levels in data frame order and fct_rev() reverses them, so the first school in the data frame appears at the top of the y-axis.
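Since the question asks for something reusable across several plots, the two forcats calls could be wrapped in a small function. The sketch below is one possible way to do that; plot_heatmap and the toy data frame are hypothetical names and values, and the function assumes the columns are called school, variable and value and that the rows are already sorted.

```r
library(ggplot2)
library(forcats)

# Hypothetical helper: fixes the school order to row order, then plots
plot_heatmap <- function(data) {
  # Levels in order of appearance, reversed so the first row ends up on top
  data$school <- fct_rev(fct_inorder(data$school))
  ggplot(data, aes(variable, school)) +
    geom_tile(aes(fill = value), colour = "white") +
    scale_fill_gradient(low = "white", high = "steelblue")
}

# Made-up subset of the data, already sorted by N and in long format
toy <- data.frame(
  school   = rep(c("School C (12)", "School A (25)", "School B (28)"), times = 2),
  variable = rep(c("var1", "var2"), each = 3),
  value    = c(1, 1, 0, 0, 0, 0)
)

p <- plot_heatmap(toy)
levels(p$data$school)  # reversed row order; ggplot draws the first level at the bottom
```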