
How can I prioritize the y scale I get from geom_histogram?

Tags: r, ggplot2

I would like to draw a vertical line at the median of my data in each panel of a faceted histogram. I'd like to do this with stat_summary, as shown below. The problem with this approach is that the y axis is not on the right scale.

I think this is because I call ggplot(aes(x=data, y=data)). That seems a bit strange for a histogram, but I do it because stat_summary requires a y aesthetic. Is there a way to plot the medians with stat_summary but keep the y scale I get from geom_histogram? I could add a ylim, but I may not know the right upper limit a priori for new data I might see.

Below is a reprex to generate my example.

library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.6.2

d = tribble(
  ~groupvar, ~data,
  'a', rlnorm(10,2, 0.5),
  'b', rlnorm(10,2, 0.5),
  'c', rlnorm(10,5, 0.5)
) %>% unnest(c(data))



d %>% 
  ggplot(aes(x = data, y = data, group = groupvar))+
  geom_histogram(aes(y = ..count..))+
  facet_grid(~groupvar)+
  stat_summary(aes(x=0, xintercept=stat(y)), fun = median, geom = 'vline')
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2020-04-01 by the reprex package (v0.3.0)

Demetri Pananos asked Apr 01 '20 05:04



1 Answer

You can delete the y mappings from ggplot() and geom_histogram() (including y = ..count..) and map y only in the stat_summary layer:

d %>% 
  ggplot(aes(x = data, group = groupvar))+
  geom_histogram() +
  facet_grid(~groupvar) + 
  stat_summary(aes(y = 0, xintercept=stat(x)), fun = median, geom = 'vline')


I also (a) removed the x = 0 in the stat_summary layer and (b) changed stat(y) to stat(x) (since you're plotting a vertical line, makes sense for it to be a summary of the x values).
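
If the stat_summary route turns out to be awkward, one alternative (not part of the answer above, just a minimal sketch) is to compute the per-group medians first and hand them to geom_vline. The names medians and med, the base data.frame construction, the fixed seed, and the explicit bins = 30 are assumptions made for illustration. Since geom_vline only maps xintercept, it adds no y aesthetic, so the y axis should stay on the count scale that geom_histogram produces.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same shape of data as the question: three groups of 10 log-normal draws
d <- data.frame(
  groupvar = rep(c("a", "b", "c"), each = 10),
  data = c(rlnorm(10, 2, 0.5), rlnorm(10, 2, 0.5), rlnorm(10, 5, 0.5))
)

# One median per facet group (hypothetical helper table)
medians <- d %>%
  group_by(groupvar) %>%
  summarise(med = median(data))

# The histogram sets the y scale; geom_vline only contributes xintercept
p <- ggplot(d, aes(x = data)) +
  geom_histogram(bins = 30) +
  geom_vline(data = medians, aes(xintercept = med)) +
  facet_grid(~groupvar)

print(p)
```

Because medians carries the groupvar column, facet_grid places each line in the matching panel, and no ylim has to be guessed for new data.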

Gregor Thomas answered Oct 11 '22 17:10
