
Adding boxplot below density plot

Tags: r, ggplot2

I'm new to ggplot and I'm trying to create this graph:

[Image: the target plot]

But actually, I'm just stuck here:

[Image: the current plot]

This is my code:

ggplot(diamonds)  + 
  aes(x = carat, group = cut) + 
  geom_line(stat = "density", size = 1) + 
  theme_grey() + 
  facet_wrap(~cut, nrow = 5, strip.position = "right") + 
  geom_boxplot(aes())

Does someone know what I can do next?

Benco016 asked Jan 27 '19 23:01



1 Answer

Edit: As of ggplot2 3.3.0, this can be done in ggplot2 without any extension package.

From the package's NEWS, under new features:

All geoms and stats that had a direction (i.e. where the x and y axes had different interpretation), can now freely choose their direction, instead of relying on coord_flip(). The direction is deduced from the aesthetic mapping, but can also be specified directly with the new orientation argument (@thomasp85, #3506).

The following now works directly (it replaces all references to geom_boxploth / stat_boxploth in the original answer with geom_boxplot / stat_boxplot):

library(ggplot2)

ggplot(diamonds, aes(x = carat, y = -0.5)) +

  # horizontal boxplots & density plots
  geom_boxplot(aes(fill = cut)) +
  geom_density(aes(x = carat), inherit.aes = FALSE) +
  
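  # vertical lines at Q1 / Q2 / Q3
  # (after_stat() replaces the ..var.. notation, which is deprecated in newer ggplot2)
  stat_boxplot(geom = "vline", aes(xintercept = after_stat(xlower))) +
  stat_boxplot(geom = "vline", aes(xintercept = after_stat(xmiddle))) +
  stat_boxplot(geom = "vline", aes(xintercept = after_stat(xupper))) +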
  
  facet_grid(cut ~ .) +
  scale_fill_discrete()
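
The orientation argument mentioned in the news item can also be set explicitly rather than deduced. The snippet below is only a minimal sketch of that idea, not part of the chart above: it maps cut to y (an assumption made for illustration, as are the names p.h and h.stats) and then inspects the statistics the layer computes, which is where the xlower / xmiddle / xupper names used for the vertical lines come from.

```r
library(ggplot2)

# One horizontal box per cut; orientation = "y" states the direction
# explicitly instead of leaving ggplot2 to deduce it from the aesthetics.
p.h <- ggplot(diamonds, aes(x = carat, y = cut)) +
  geom_boxplot(orientation = "y")

# Statistics computed for the layer: for a horizontal box plot they are
# named xlower / xmiddle / xupper rather than lower / middle / upper.
h.stats <- layer_data(p.h)
h.stats[, c("xlower", "xmiddle", "xupper")]
```

Looking at layer_data() this way is a quick check of which computed variables are available before referring to them in aes().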

Original answer

This can be done easily with the horizontal box plot geom_boxploth() / stat_boxploth() from the ggstance package:

library(ggplot2)
library(ggstance)

ggplot(diamonds, aes(x = carat, y = -0.5)) +

  # horizontal box plot
  geom_boxploth(aes(fill = cut)) +

  # normal density plot
  geom_density(aes(x = carat), inherit.aes = FALSE) +

  # vertical lines at Q1 / Q2 / Q3
  stat_boxploth(geom = "vline", aes(xintercept = ..xlower..)) +
  stat_boxploth(geom = "vline", aes(xintercept = ..xmiddle..)) +
  stat_boxploth(geom = "vline", aes(xintercept = ..xupper..)) +
  
  facet_grid(cut ~ .) +
  
  # reproduce the original chart's color scale (otherwise ordered factors
  # get the viridis scale by default in the current version of ggplot2)
  scale_fill_discrete()


If you are limited to the ggplot2 package for one reason or another, it can still be done, but it would be less straightforward, since geom_boxplot() and geom_density() go in different directions.

Alternative 1: calculate the box plot's coordinates and flip them manually before passing the results to ggplot(). Add the density layer in the normal way:

library(dplyr)
library(tidyr)

p.box <- ggplot(diamonds, aes(x = cut, y = carat)) + geom_boxplot()    
p.box.data <- layer_data(p.box) %>%
  select(x, ymin, lower, middle, upper, ymax, outliers) %>%
  mutate(cut = factor(x, labels = levels(diamonds$cut), ordered = TRUE)) %>%
  select(-x)

ggplot(p.box.data) +
  
  # manually plot flipped boxplot
  geom_segment(aes(x = ymin, xend = ymax, y = -0.5, yend = -0.5)) +
  geom_rect(aes(xmin = lower, xmax = upper, ymin = -0.75, ymax = -0.25, fill = cut),
            color = "black") +
  geom_point(data = . %>% unnest(outliers),
             aes(x = outliers, y = -0.5)) +
  
  # vertical lines at Q1 / Q2 / Q3
  geom_vline(data = . %>% select(cut, lower, middle, upper) %>% gather(key, value, -cut),
             aes(xintercept = value)) +
  
  # density plot
  geom_density(data = diamonds, aes(x = carat)) +

  facet_grid(cut ~ .) +
  labs(x = "carat") +
  scale_fill_discrete()
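
Alternative 1 relies on layer_data() returning the box plot statistics that ggplot2 has already computed. As a rough sanity check (a sketch only; the names p.check, check.data and direct.median are made up for illustration), the middle column can be compared with medians calculated directly from the data:

```r
library(ggplot2)

# Same box plot as p.box above, rebuilt here so the snippet is self-contained
p.check <- ggplot(diamonds, aes(x = cut, y = carat)) + geom_boxplot()
check.data <- layer_data(p.check)

# Medians computed directly from the data, one per level of cut
direct.median <- as.numeric(tapply(diamonds$carat, diamonds$cut, median))

# Compare the layer's middle column with the directly computed medians
all.equal(check.data$middle, direct.median)
```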

Alternative 2: calculate the density plot's coordinates and flip them manually before passing the results to ggplot(). Add the box plot layer in the normal way, then flip the whole chart:

p.density <- ggplot(diamonds, aes(x = carat, group = cut)) + geom_density()    
p.density.data <- layer_data(p.density) %>%
  select(x, y, group) %>%
  mutate(cut = factor(group, labels = levels(diamonds$cut), ordered = TRUE)) %>%
  select(-group)
p.density.data <- p.density.data %>%
  rbind(p.density.data %>% 
          group_by(cut) %>% 
          filter(x == min(x)) %>% 
          mutate(y = 0) %>% 
          ungroup())

ggplot(diamonds, aes(x = -0.5, y = carat)) +

  # manually flipped density plot
  geom_polygon(data = p.density.data, aes(x = y, y = x), 
               fill = NA, color = "black") +

  # box plot
  geom_boxplot(aes(fill = cut, group = cut)) +

  # lines at Q1 / Q2 / Q3 (drawn as hlines; vertical after coord_flip())
  stat_boxplot(geom = "hline", aes(yintercept = ..lower..)) +
  stat_boxplot(geom = "hline", aes(yintercept = ..middle..)) +
  stat_boxplot(geom = "hline", aes(yintercept = ..upper..)) +

  facet_grid(cut ~ .) +
  scale_fill_discrete() +
  coord_flip()

Z.Lin answered Sep 27 '22 17:09