
ggplot2: geom_smooth confidence band does not extend to edge of graph, even with fullrange=TRUE

Tags: r, ggplot2

I'm generating a few scatterplots in ggplot2, and the standard-error (se) shading from geom_smooth (stat_smooth has exactly the same issue) won't extend across the full range of my plot (see the plot image), which is driving me crazy.

As the code shows, I set fullrange = TRUE, and it does extend the line itself (and the se shading on my other fit line), but for some reason the shading on one of my fit lines is cut short.

The problem seems to arise where the band meets the upper boundary of the plot. If I extend the axis ranges far enough that the line hits the right boundary instead, the shading continues without any problem, but that is not an option: I would have to double the x- and y-axis ranges, which squashes my data.

Does anyone have any idea how to get the shade to extend all the way to the upper axis boundary?

[Image: code to produce plot]

[Image: plot]

J Ross asked Feb 11 '16 15:02


1 Answer

You probably need to add coord_cartesian in addition to scale_x/y_continuous. Limits set in scale_x/y_continuous remove points that fall outside them (by default they are replaced with NA), whereas coord_cartesian only zooms the view and keeps all of the data, even if some of it is not visible in the graph. In your plot, the confidence band for the red points ends where the top of the band exceeds the y-range of the graph.

There's no actual "data" in the extended range of your graph, but geom_smooth treats the points it generates for plotting the confidence bands as "data" for the purposes of deciding what to plot.
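One way to check this is to look at the points geom_smooth generates, rather than at the picture. The sketch below is only an illustration on mtcars: it uses layer_data() to pull the computed band and counts how many lower-band values come back as NA. The wide limits of c(-200, 200) are an arbitrary choice of mine, picked only to be large enough to hold every band point for this data.

```r
library(ggplot2)

# Same fit as in the answer's examples, with the x scale stretched to 0..10.
# formula = y ~ x is spelled out only to silence the informational message.
base <- ggplot(mtcars, aes(wt, mpg, colour = factor(am))) +
  geom_smooth(method = "lm", formula = y ~ x, fullrange = TRUE) +
  scale_x_continuous(limits = c(0, 10))

# Narrow y limits: band points below 0 fall outside the scale.
narrow <- layer_data(base + scale_y_continuous(limits = c(0, 100)))

# Wide y limits (arbitrary, assumed large enough for this data).
wide <- layer_data(base + scale_y_continuous(limits = c(-200, 200)))

# Count lower-band values that the scale replaced with NA.
n_censored_narrow <- sum(is.na(narrow$ymin))
n_censored_wide <- sum(is.na(wide$ymin))

print(c(narrow = n_censored_narrow, wide = n_censored_wide))
```

If the narrow count is above zero and the wide count is zero, the band is being cut by the scale limits and not by the smoother itself.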

Take a look at the examples below. The first plot uses only scale_x/y_continuous. The second adds coord_cartesian with the same ranges, but note that the confidence bands are still cut off, because the scale limits still drop the band points that fall below zero. In the third plot, we keep coord_cartesian but expand the scale_y_continuous range downward, so that points in the confidence band below zero are included in the y-range. coord_cartesian then determines the range that's actually plotted, and it does not exclude points that fall outside that range.

I actually find this behavior confusing. I would have thought that you could just use coord_cartesian alone with the desired x and y ranges and still have the confidence bands and regression lines plotted all the way to the edges of the graph. In any case, hopefully this will get you what you're looking for.

library(ggplot2)  # gridExtra must also be installed for the last line

# p1: limits on the scales only; band points outside the y limits are dropped
p1 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) + 
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(0,100)) +
  ggtitle("scale_x/y_continuous")

# p2: coord_cartesian added, but the y scale limits are unchanged
p2 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) + 
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(0,100)) +
  coord_cartesian(xlim=c(0,10), ylim=c(0,100)) +
  ggtitle("Add coord_cartesian; same y-range")

# p3: y scale limits widened to -50 so the band below zero is kept;
# coord_cartesian still zooms the view to 0..100
p3 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) +
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(-50,100)) +
  coord_cartesian(xlim=c(0,10), ylim=c(0,100)) +
  ggtitle("Add coord_cartesian; expanded y-range")

# Stack the three plots vertically for comparison
gridExtra::grid.arrange(p1, p2, p3)

[Image: the three plots produced by the code above]
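On the confusion about using coord_cartesian alone: my understanding, which I have not verified for every ggplot2 version, is that fullrange=TRUE extends the fit to the range of the x scale, and coord_cartesian changes only the view, not the scale. Without scale limits the x scale is trained on the data, so the fit would stop at the data range. The sketch below is a way to check that on mtcars; the object names are mine.

```r
library(ggplot2)

# coord_cartesian only: no limits are set on the scales themselves.
p_coord_only <- ggplot(mtcars, aes(wt, mpg, colour = factor(am))) +
  geom_smooth(method = "lm", formula = y ~ x, fullrange = TRUE) +
  coord_cartesian(xlim = c(0, 10), ylim = c(0, 100))

# x values at which the smoother was evaluated, next to the data range.
fit_x <- range(layer_data(p_coord_only)$x)
data_x <- range(mtcars$wt)

print(rbind(fit_x, data_x))
```

If fit_x stays well inside 0..10, that would explain why the scale limits are still needed to push the fit out to the edges, with coord_cartesian handling only what is shown.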

eipi10 answered Oct 22 '22 10:10