While looking at this issue, I couldn't specify a custom linear model to geom_smooth. My code is as follows:
example.label <- c("A","A","A","A","A","B","B","B","B","B")
example.value <- c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9)
example.age <- c(30, 40, 50, 60, 70, 30, 40, 50, 60, 70)
example.score <- c(90,95,89,91,85,83,88,94,83,90)
example.data <- data.frame(example.label, example.value,example.age,example.score)
p = ggplot(example.data, aes(x=example.age,
                             y=example.value, color=example.label)) +
  geom_point()
  #geom_smooth(method = lm)
cf = function(dt){
  lm(example.value ~ example.age + example.score, data = dt)
}
cf(example.data)
p_smooth <- by(example.data, example.data$example.label,
               function(x) geom_smooth(data=x, method = lm, formula = cf(x)))
p + p_smooth
I am getting this error/warning:
Warning messages:
1: Computation failed in `stat_smooth()`:
object 'weight' not found
2: Computation failed in `stat_smooth()`:
object 'weight' not found
Why am I getting this? And what is the proper method of specifying a custom model to geom_smooth? Thanks.
In R, the geom_smooth() function draws a regression line or other smoother over the data. Its method parameter sets the smoothing method (function) used to fit the line.
To add a regression line to a scatter plot, use geom_smooth() with the argument method = lm, where lm stands for linear model.
The message geom_smooth() using formula 'y ~ x' is not an error. If you do not supply a formula for the fit, geom_smooth assumes y ~ x, which is a linear relationship between x and y.
stat_smooth: Add a smoother. Aids the eye in seeing patterns in the presence of overplotting.
The regression function for a regression model with two continuous predictor variables and a continuous outcome lives in a 3D space (two for the predictors, one for the outcome), while the ggplot graph is a 2D space (one continuous predictor on the x-axis and the outcome on the y-axis). That's the fundamental reason why you can't plot a function of two continuous predictor variables with geom_smooth.
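As for what geom_smooth does accept: its formula argument takes a formula written in terms of the aesthetics x and y (for example y ~ x or y ~ poly(x, 2)), not a fitted model object such as the result of cf(x). Below is a minimal sketch of a custom single-predictor model on the data from the question. The shortened column names (label, value, age, score) and the choice of a quadratic fit are assumptions made for illustration only.

```r
library(ggplot2)

# Data from the question; the column names are shortened here (an assumption
# for readability, not something the question uses).
example.data <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2),
  score = c(90, 95, 89, 91, 85, 83, 88, 94, 83, 90)
)

# `formula` refers to the aesthetics x and y, not to column names.
# Mapping colour to label splits the data into groups, so geom_smooth
# fits one quadratic model per label without any call to by().
p_quad <- ggplot(example.data, aes(x = age, y = value, colour = label)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)

p_quad
```

Because the grouping comes from the colour aesthetic, there is no need to build one geom_smooth layer per subset of the data.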
One "workaround" is to pick a few specific values of one of the continuous predictor variables and then plot a line for the other continuous predictor on the x-axis for each of the chosen values of the first variable.
Here's an example with the mtcars data frame. The regression model below predicts mpg using wt and hp. We then plot the predictions of mpg vs. wt for various values of hp. We create a data frame of predictions and then plot using geom_line. Each line in the graph represents the regression prediction for mpg vs. wt for different values of hp. You can, of course, also reverse the roles of wt and hp.
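library(ggplot2)
theme_set(theme_classic())
d = mtcars
# Fit the model with two continuous predictors
m2 = lm(mpg ~ wt + hp, data=d)
# Prediction grid: 20 values of wt crossed with the five quantiles of hp
pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length.out=20),
                        hp = quantile(d$hp))
pred.data$mpg = predict(m2, newdata=pred.data)
# One predicted line of mpg vs. wt per hp value
ggplot(pred.data, aes(wt, mpg, colour=factor(hp))) +
  geom_line() +
  labs(colour="HP Quantiles")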
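The same approach can be sketched for the data in the question. This is only an illustration under a few assumptions: the columns are renamed to label, value, age and score, a separate model is fitted for each label (mirroring the by() call in the question), and score is held at its quartiles within each label. With only five rows per label, the fitted lines are very rough.

```r
library(ggplot2)

# Data from the question, with shortened column names (an assumption).
example.data <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2),
  score = c(90, 95, 89, 91, 85, 83, 88, 94, 83, 90)
)

# Fit the two-predictor model separately for each label.
fits <- lapply(split(example.data, example.data$label),
               function(d) lm(value ~ age + score, data = d))

# For each label, predict over a range of ages at three fixed scores.
pred.data <- do.call(rbind, lapply(names(fits), function(lab) {
  d <- example.data[example.data$label == lab, ]
  grid <- expand.grid(
    age   = seq(min(d$age), max(d$age), length.out = 20),
    score = quantile(d$score, c(0.25, 0.5, 0.75))
  )
  grid$label <- lab
  grid$value <- predict(fits[[lab]], newdata = grid)
  grid
}))

# Raw points plus one predicted line per label and chosen score.
p_lines <- ggplot(example.data, aes(age, value, colour = label)) +
  geom_point() +
  geom_line(data = pred.data,
            aes(linetype = factor(score),
                group = interaction(label, score))) +
  labs(linetype = "score")

p_lines
```

Fitting a single model with label as an additional predictor would be an alternative; the per-label fits are used here only to stay close to the question's code.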
Another option is to use a colour gradient to represent mpg (the outcome) and plot wt and hp on the x and y axes:
pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length.out=100),
                        hp = seq(min(d$hp), max(d$hp), length.out=100))
pred.data$mpg = predict(m2, newdata=pred.data)
ggplot(pred.data, aes(wt, hp, z=mpg, fill=mpg)) +
  geom_tile() +
  scale_fill_gradient2(low="red", mid="yellow", high="blue",
                       midpoint=median(pred.data$mpg))