
Unintended line across X axis of density plot (r)

I am trying to identify why I have a purple line appearing along the x axis that is the same color as "Prypchan, Lida" from my legend. I took a look at the data and do not see any issues there.

ggplot(LosDoc_Ex, aes(x = LOS)) +
  geom_density(aes(colour = AttMD)) +
  theme(legend.position = "bottom") +
  xlab("Length of Stay") +
  ylab("Distribution") +
  labs(title = "LOS Analysis * ",
       caption = "*excluding Residential and WSH",
       color = "Attending MD: ")

LOS Analysis by Doc

Luke Baker asked Jan 28 '23 09:01


1 Answer

Usually I'd wait for a reproducible example, but in this case, I'd say the underlying explanation is really quite straightforward:

geom_density() creates a polygon, not a line.

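Using a sample dataset that ships with ggplot2, we can observe the same straight line below the density plots, running along the x-axis & y-axis. Because each density is drawn as a closed polygon, its outline includes the baseline at y = 0. The colour of that line simply depends on which group's polygon is on top of the rest: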

library(ggplot2)
p <- ggplot(diamonds, aes(carat, colour = cut)) +
  geom_density()
p

plot

Workaround 1: You can manually calculate the density values yourself for each colour group in a new data frame, & plot the results using geom_line() instead of geom_density():

library(dplyr)
library(tidyr)
library(purrr)
diamonds2 <- diamonds %>%
  nest(data = -cut) %>%                                # one row per cut (tidyr >= 1.0 syntax)
  mutate(density = map(data, ~density(.x$carat))) %>%  # kernel density estimate per group
  mutate(density.x = map(density, ~.x[["x"]]),
         density.y = map(density, ~.x[["y"]])) %>%
  select(cut, density.x, density.y) %>%
  unnest(c(density.x, density.y))                      # back to one row per point

ggplot(diamonds2, aes(x = density.x, y = density.y, colour = cut)) +
  geom_line()

plot with new data frame
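The same idea can be sketched in base R, without the nesting helpers. This is only an illustrative variant of Workaround 1, not part of the original answer; the names carat_by_cut, density_list, diamonds3 and p_base are made up for this example:

```r
library(ggplot2)

# Split the carat values by cut, then estimate one density curve per group.
carat_by_cut <- split(diamonds$carat, diamonds$cut)
density_list <- lapply(names(carat_by_cut), function(level) {
  d <- density(carat_by_cut[[level]])  # density() evaluates at 512 points by default
  data.frame(cut = level, density.x = d$x, density.y = d$y)
})
diamonds3 <- do.call(rbind, density_list)

# Restore the original (ordered) factor so the legend order matches the other plots.
diamonds3$cut <- factor(diamonds3$cut, levels = levels(diamonds$cut), ordered = TRUE)

p_base <- ggplot(diamonds3, aes(x = density.x, y = density.y, colour = cut)) +
  geom_line()
p_base
```

Since geom_line() only connects the computed points, there is no polygon outline and therefore no baseline along the x-axis.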

Workaround 2: Or you can take the data generated by the original plot, & plot that using geom_line(). The colours would need to be remapped to the legend values though:

lp <- layer_data(p)
# Group numbers follow the order of the factor levels (or of the sorted unique values)
if (is.factor(diamonds$cut)) {
  col.lev <- levels(diamonds$cut)
} else {
  col.lev <- sort(unique(diamonds$cut))
}
lp$cut <- factor(lp$group, labels = col.lev)

ggplot(lp, aes(x = x, y = ymax, colour = cut)) +
  geom_line()

plot with data frame from original plot
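A shorter variant, which is not one of the workarounds above but follows the same reasoning, is to keep the density statistic and swap only the geom. A minimal sketch, assuming the default settings of the density stat are acceptable (p_line is just a name chosen for this example):

```r
library(ggplot2)

# stat = "density" computes the kernel density estimate for each group,
# while geom_line() draws only the curve, with no closed outline.
p_line <- ggplot(diamonds, aes(x = carat, colour = cut)) +
  geom_line(stat = "density")
p_line
```

Applied to the plot in the question, this would presumably mean replacing geom_density(aes(colour = AttMD)) with geom_line(aes(colour = AttMD), stat = "density"). Recent ggplot2 releases (3.3.0 onward, as far as the documentation indicates) also give geom_density() an outline.type argument that controls which parts of the outline are drawn, so it may be worth checking the installed version before reaching for a workaround.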

Z.Lin answered Jan 30 '23 23:01