
R/ggplot2: smooth on entire dataset while enforcing a ylim cap

UPDATE: I found the answer... included it below.

I have a dataset that contains the following variables and similar values:

COBSDATE    CITY  RESPONSE_TIME
2011-11-23  A     1.1
2011-11-23  A     1.5
2011-11-23  A     1.2
2011-11-23  B     2.3
2011-11-23  B     2.1
2011-11-23  B     1.8
2011-11-23  C     1.4
2011-11-23  C     6.1
2011-11-23  A     3.1
2011-11-23  A     1.1

I have successfully created a graph that displays all of the response_time values and a smooth geometry to further describe some of the variation.

My challenge is that I want a better view of the smoothed value, and one of the cities has frequent 'outliers'. I can control this by adding ylim(0,p99) to the plot, but this then causes the smooth to be calculated only on the subset of data that falls inside the limits.

Is there a way to use all of the data for the smoothed plot and only the subset for the jitter plot?

My code is below (both versions are the same except for the + ylim(0,20)). Truncated -

ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) + 
    geom_jitter(colour=alpha("#007DB1", 1/8)) + 
    geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) + 
    ylim(0,20) + 
    facet_wrap(~CITY)

Whole data set -

ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) + 
    geom_jitter(colour=alpha("#007DB1", 1/8)) + 
    geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) + 
    facet_wrap(~CITY)
BenH asked Feb 29 '12 19:02

1 Answer

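If you just want to "zoom in", you can use coord_cartesian. Unlike ylim, it only changes the visible range and does not drop any data, so the smooth is still computed on the whole dataset: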

ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) + 
  geom_jitter(colour=alpha("#007DB1", 1/8)) + 
  geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) + 
  coord_cartesian(ylim=c(0,20)) + 
  facet_wrap(~CITY)
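
The difference between the two kinds of limits can be checked by inspecting what ggplot2 computes for the smooth layer. The following is only a minimal sketch: the `toy` data frame is made up for illustration, and it uses a linear fit (`method = "lm"`) rather than the default smoother so that the result is easy to reason about.

```r
library(ggplot2)

# Hypothetical toy data: 95 ordinary values around 5 plus five outliers at 100.
set.seed(1)
toy <- data.frame(
  x = rep(1:50, times = 2),
  y = c(rnorm(95, mean = 5, sd = 1), rep(100, 5))
)

base <- ggplot(toy, aes(x = x, y = y)) +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale limits: values outside [0, 20] are turned into NA and removed
# before the fit is computed (ggplot2 warns about the removed rows).
clipped <- base + ylim(0, 20)

# Coordinate limits: every row is used for the fit, only the view is cropped.
zoomed <- base + coord_cartesian(ylim = c(0, 20))

# ggplot_build() exposes the computed data for each layer.
fit_clipped <- ggplot_build(clipped)$data[[1]]
fit_zoomed  <- ggplot_build(zoomed)$data[[1]]

# The outliers pull the fitted line upward only in the zoomed version.
max(fit_clipped$y, na.rm = TRUE)
max(fit_zoomed$y, na.rm = TRUE)
```

The second maximum should be clearly larger than the first, which shows that the outliers still influence the fit when coord_cartesian is used.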

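If you want to use a subset of the data for the jitter geom, then override the data inheritance for that layer. Keep coord_cartesian for the zoom here: ylim(0,20) sets scale limits, which apply to every layer and would drop the out-of-range values from the smooth as well.

ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) + 
  geom_jitter(data=subset(dataRaw, RESPONSE_TIME>=0 & RESPONSE_TIME<=20), 
              colour=alpha("#007DB1", 1/8)) + 
  geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) + 
  coord_cartesian(ylim=c(0,20)) + 
  facet_wrap(~CITY)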
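
To confirm which rows each layer actually receives, the built plot can be inspected layer by layer. This is a rough sketch on made-up data (the `toy` data frame and the linear fit are assumptions for illustration, and geom_point stands in for geom_jitter to keep it deterministic). It zooms with coord_cartesian, because scale limits set with ylim() are applied to all layers, including the smooth.

```r
library(ggplot2)

# Hypothetical toy data: 95 ordinary values around 5 plus five outliers at 100.
set.seed(1)
toy <- data.frame(
  x = rep(1:50, times = 2),
  y = c(rnorm(95, mean = 5, sd = 1), rep(100, 5))
)

p <- ggplot(toy, aes(x = x, y = y)) +
  # Layer 1: points drawn from a subset only (overrides the inherited data).
  geom_point(data = subset(toy, y >= 0 & y <= 20)) +
  # Layer 2: smooth inherits the full data frame from ggplot().
  geom_smooth(method = "lm", formula = y ~ x) +
  # Crop the view without removing any rows from either layer.
  coord_cartesian(ylim = c(0, 20))

built <- ggplot_build(p)

nrow(built$data[[1]])                 # rows given to the point layer
max(built$data[[2]]$y, na.rm = TRUE)  # highest fitted value of the smooth
```

The point layer only contains the in-range rows, while the fitted line is still pulled upward by the outliers it was computed from.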
Dan M. answered Oct 03 '22 22:10