I have a dataset that contains the following variables and similar values:
COBSDATE, CITY, RESPONSE_TIME
2011-11-23 A 1.1
2011-11-23 A 1.5
2011-11-23 A 1.2
2011-11-23 B 2.3
2011-11-23 B 2.1
2011-11-23 B 1.8
2011-11-23 C 1.4
2011-11-23 C 6.1
2011-11-23 A 3.1
2011-11-23 A 1.1
I have successfully created a graph that displays all of the response_time values and a smooth geometry to further describe some of the variation.
The challenge that I have is that I want a better view of the smoothed value, and one of the cities has frequent 'outliers'. I can control this by adding ylim(0,p99) to the plot, but this then causes the smooth to only be calculated on the subset of data.
Is there a way to use all of the data for the smoothed plot and only the subset for the jitter plot?
My code is below (both versions are the same except for the + ylim(0,20)):
Truncated -
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
ylim(0,20) +
facet_wrap(~CITY)
Whole data set -
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
facet_wrap(~CITY)
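If you just want to "zoom in", you can use coord_cartesian, which changes only the visible range and does not drop any data, so the smooth is still fitted to all of it: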
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
coord_cartesian(ylim=c(0,20)) +
facet_wrap(~CITY)
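To see why the two approaches differ, the sketch below builds the same plot both ways and inspects the data each layer receives. The `toy` data frame and its column names are made up for illustration; `ggplot_build()` is used here only to look at the layer data, not as part of the plotting recipe.

```r
library(ggplot2)

# Hypothetical toy data: mostly small values plus five large outliers.
set.seed(1)
toy <- data.frame(
  x = 1:100,
  y = c(runif(95, 1, 3), runif(5, 30, 60))
)

base <- ggplot(toy, aes(x = x, y = y)) + geom_point()

# ylim() sets scale limits: values outside the range are turned into NA
# before any stat is computed, so every layer loses those rows.
clipped <- ggplot_build(base + ylim(0, 20))$data[[1]]

# coord_cartesian() only changes the visible window; all rows are kept.
zoomed <- ggplot_build(base + coord_cartesian(ylim = c(0, 20)))$data[[1]]

n_clipped <- sum(is.na(clipped$y))  # rows censored by ylim()
n_zoomed  <- sum(is.na(zoomed$y))   # rows censored by coord_cartesian()
```

Any stat layered on top of the `ylim()` version, geom_smooth included, only ever sees the rows that survived the censoring.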
If you want to use a subset of the data for the jitter geom, then override the data inheritance:
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(data=subset(dataRaw, RESPONSE_TIME>=0 & RESPONSE_TIME<=20),
colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
# no ylim() here: scale limits would also drop the out-of-range rows from the smooth
facet_wrap(~CITY)
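The question mentions a 99th-percentile cutoff rather than a fixed 20. A minimal sketch of the same idea with a computed cutoff follows; the simulated `dataRaw` and the helper names `p99` and `jitterData` are assumptions for illustration only, and a single global cutoff is used rather than one per city.

```r
library(ggplot2)

# Simulated stand-in for dataRaw (assumption: same three columns as in the question).
set.seed(42)
n <- 300
dataRaw <- data.frame(
  COBSDATE = rep(seq(as.Date("2011-11-01"), by = "day", length.out = 30), each = 10),
  CITY = sample(c("A", "B", "C"), n, replace = TRUE),
  RESPONSE_TIME = rexp(n, rate = 0.5)
)

# Global 99th percentile of the response times.
p99 <- quantile(dataRaw$RESPONSE_TIME, 0.99, na.rm = TRUE)

# Only the jitter layer gets the trimmed data; the smooth inherits all of dataRaw.
jitterData <- subset(dataRaw, RESPONSE_TIME <= p99)

p <- ggplot(dataRaw, aes(x = COBSDATE, y = RESPONSE_TIME)) +
  geom_jitter(data = jitterData, colour = alpha("#007DB1", 1/8)) +
  geom_smooth(colour = "gray30", fill = alpha("gray40", 0.5)) +
  facet_wrap(~CITY)
```

Printing `p` draws the plot. If one city dominates the outliers, computing the cutoff separately per city may be a better fit.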