
Constraining stat_smooth to a particular range

Tags: r, ggplot2

I would like to fit two different lines of best fit to two parts of my plot. I could subset the data, but I was wondering whether it's possible to define a range over which stat_smooth will operate.

For instance, I would like to fit two separate lines to this data, one for lat<100 and one for lat>100.

test<-data.frame(ecdf=c(0.02040816,0.04081633,0.06122449,0.08163265,0.10204082,0.14285714,0.14285714,0.16326531,0.24489796,0.24489796,0.24489796,0.24489796,0.26530612,0.28571429,0.30612245,0.32653061,0.36734694,0.36734694,0.38775510,0.40816327,0.42857143,0.46938776,0.46938776,0.48979592,0.53061224,0.53061224,0.59183673,0.59183673,0.59183673,0.61224490,0.63265306,0.65306122,0.67346939,0.69387755,0.71428571,0.73469388,0.75510204,0.77551020,0.79591837,0.81632653,0.83673469,0.85714286,0.87755102,0.89795918,0.91836735,0.93877551,0.95918367,0.97959184,0.99900000),lat=c(50.7812,66.4062,70.3125,97.6562,101.5620,105.4690,105.4690,109.3750,113.2810,113.2810,113.2810,113.2810,125.0000,136.7190,148.4380,164.0620,167.9690,167.9690,171.8750,175.7810,183.5940,187.5000,187.5000,191.4060,195.3120,195.3120,234.3750,234.3750,234.3750,238.2810,261.7190,312.5000,316.4060,324.2190,417.9690,507.8120,511.7190,562.5000,664.0620,683.5940,957.0310,1023.4400,1050.7800,1070.3100,1109.3800,1484.3800,1574.2200,1593.7500,1750.0000))

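library(ggplot2)

# Note: scale_y_probit() comes from older ggplot2 releases; in recent
# versions the equivalent is scale_y_continuous(trans = "probit").
p <- ggplot(test, aes(lat, ecdf))
p + geom_point() + scale_y_probit() + scale_x_log10() + stat_smooth(method = "lm")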

FGiorlando asked Dec 10 '11 22:12



1 Answer

You could always do something like this:

p + geom_point() + scale_y_probit() + scale_x_log10() +
  geom_smooth(data = subset(test, lat > 100), method = "lm") +
  geom_smooth(data = subset(test, lat <= 100), method = "lm")

But it's probably preferable to first define a factor (here latcat) that marks the two groups of points you want to separately smooth. Then, include the factor in your aesthetic, and geom_smooth() will do the rest of the work for you:

test$latcat <- cut(test$lat, 
                   breaks = c(-Inf, 100, Inf), 
                   labels = c("<=100", ">100"))

p <- ggplot(test, aes(lat, ecdf, colour = latcat))
p + geom_point() + scale_y_probit() + scale_x_log10() +
  geom_smooth(method = "lm")
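
As a rough cross-check of what the grouped geom_smooth() draws, the two fits can also be reproduced outside ggplot2. The sketch below assumes the test data frame from the question and relies on the fact that scale transformations are applied before the smoother runs, so each line is a linear fit of qnorm(ecdf) (the probit transform) against log10(lat). The names fits and coefs are introduced here only for illustration.

```r
# Data copied from the question.
test <- data.frame(
  ecdf = c(0.02040816,0.04081633,0.06122449,0.08163265,0.10204082,0.14285714,0.14285714,0.16326531,0.24489796,0.24489796,0.24489796,0.24489796,0.26530612,0.28571429,0.30612245,0.32653061,0.36734694,0.36734694,0.38775510,0.40816327,0.42857143,0.46938776,0.46938776,0.48979592,0.53061224,0.53061224,0.59183673,0.59183673,0.59183673,0.61224490,0.63265306,0.65306122,0.67346939,0.69387755,0.71428571,0.73469388,0.75510204,0.77551020,0.79591837,0.81632653,0.83673469,0.85714286,0.87755102,0.89795918,0.91836735,0.93877551,0.95918367,0.97959184,0.99900000),
  lat = c(50.7812,66.4062,70.3125,97.6562,101.5620,105.4690,105.4690,109.3750,113.2810,113.2810,113.2810,113.2810,125.0000,136.7190,148.4380,164.0620,167.9690,167.9690,171.8750,175.7810,183.5940,187.5000,187.5000,191.4060,195.3120,195.3120,234.3750,234.3750,234.3750,238.2810,261.7190,312.5000,316.4060,324.2190,417.9690,507.8120,511.7190,562.5000,664.0620,683.5940,957.0310,1023.4400,1050.7800,1070.3100,1109.3800,1484.3800,1574.2200,1593.7500,1750.0000)
)

# Same grouping factor as in the answer: (-Inf, 100] and (100, Inf).
test$latcat <- cut(test$lat,
                   breaks = c(-Inf, 100, Inf),
                   labels = c("<=100", ">100"))

# One linear model per group, fitted on the transformed scales:
# probit of ecdf (qnorm) against log10 of lat.
fits <- lapply(split(test, test$latcat), function(d) {
  lm(qnorm(ecdf) ~ log10(lat), data = d)
})

# Intercept and slope for each group, one row per group.
coefs <- t(sapply(fits, coef))
print(coefs)
```

The slopes and intercepts printed here are on the probit/log10 scale, which is the scale on which the plotted lines are straight. Note that only four points fall in the "<=100" group, so that fit is likely to be quite uncertain.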

Josh O'Brien answered Sep 30 '22 09:09