
geom_smooth custom linear model

While looking at this issue, I couldn't specify a custom linear model to geom_smooth. My code is as follows:

example.label <- c("A","A","A","A","A","B","B","B","B","B")
example.value <- c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9)
example.age <- c(30, 40, 50, 60, 70, 30, 40, 50, 60, 70)
example.score <- c(90,95,89,91,85,83,88,94,83,90)
example.data <- data.frame(example.label, example.value,example.age,example.score)

p = ggplot(example.data, aes(x=example.age,
                         y=example.value,color=example.label)) +
  geom_point()
  #geom_smooth(method = lm)

cf = function(dt){
  lm(example.value ~example.age+example.score, data = dt)
}

cf(example.data)

p_smooth <- by(example.data, example.data$example.label, 
               function(x) geom_smooth(data=x, method = lm, formula = cf(x)))

p + p_smooth 

I am getting this error/warning:

Warning messages:
1: Computation failed in `stat_smooth()`:
object 'weight' not found 
2: Computation failed in `stat_smooth()`:
object 'weight' not found 

Why am I getting this, and what is the proper way to specify a custom model in geom_smooth? Thanks.

AK88 asked Jun 27 '17 05:06



1 Answer

The regression function for a model with two continuous predictor variables and a continuous outcome lives in a 3D space (two dimensions for the predictors, one for the outcome), while a ggplot graph is a 2D space (one continuous predictor on the x-axis and the outcome on the y-axis). That's the fundamental reason why you can't plot a function of two continuous predictor variables with geom_smooth.
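For the part of the question about specifying a custom model: geom_smooth fits its model to the plotted x and y aesthetics, so the formula argument takes a formula written in terms of x and y, not a fitted model object such as the one returned by cf(). A minimal sketch, assuming a quadratic in age is the custom model wanted (the column names are shortened here for readability and the quadratic is only an illustration):

```r
library(ggplot2)

# Data from the question, with shortened (illustrative) column names
d <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2)
)

# The formula refers to the aesthetics x and y, not to the column names.
# Because colour is mapped to label, one curve is fitted per label.
p <- ggplot(d, aes(x = age, y = value, colour = label)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)

print(p)
```

Any model that uses only the x variable can be written this way. A second predictor such as score has no aesthetic for the smoother to use, which is why the approaches below build the predictions by hand.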

One "workaround" is to pick a few specific values of one of the continuous predictor variables and then plot a line for the other continuous predictor on the x-axis for each of the chosen values of the first variable.

Here's an example with the mtcars data frame. The regression model below predicts mpg using wt and hp. We then plot the predictions of mpg vs. wt for various values of hp. We create a data frame of predictions and then plot using geom_line. Each line in the graph represents the regression prediction for mpg vs. wt for different values of hp. You can, of course, also reverse the roles of wt and hp.

library(ggplot2)
theme_set(theme_classic())

d = mtcars
m2 = lm(mpg ~ wt + hp, data=d)

pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length=20),
                        hp = quantile(d$hp))
pred.data$mpg = predict(m2, newdata=pred.data)

ggplot(pred.data, aes(wt, mpg, colour=factor(hp))) +
  geom_line() +
  labs(colour="HP Quantiles")
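The same idea can be applied to the data from the question. The sketch below is one possible adaptation, not the only one: it fits value ~ age + score separately for each label, then predicts over the observed age range while holding score at each group's median. The median is an arbitrary choice for illustration, and the column names are shortened for readability.

```r
library(ggplot2)

# Data from the question, with shortened (illustrative) column names
d <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2),
  score = c(90, 95, 89, 91, 85, 83, 88, 94, 83, 90)
)

# For each label: fit the two-predictor model, then predict along age
# with score fixed at that group's median (an assumption of this sketch)
pred <- do.call(rbind, lapply(split(d, d$label), function(g) {
  m <- lm(value ~ age + score, data = g)
  grid <- data.frame(
    label = g$label[1],
    age   = seq(min(g$age), max(g$age), length.out = 20),
    score = median(g$score)
  )
  grid$value <- predict(m, newdata = grid)
  grid
}))

# Observed points plus one prediction line per label
p <- ggplot(d, aes(age, value, colour = label)) +
  geom_point() +
  geom_line(data = pred)

print(p)
```

Each line shows the fitted relationship between value and age at one fixed score, so it will not necessarily pass close to points whose score is far from the group median.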


Another option is to use a colour gradient to represent mpg (the outcome) and plot wt and hp on the x and y axes:

pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length=100),
                        hp = seq(min(d$hp), max(d$hp), length=100))
pred.data$mpg = predict(m2, newdata=pred.data)

ggplot(pred.data, aes(wt, hp, z=mpg, fill=mpg)) +
  geom_tile() +
  scale_fill_gradient2(low="red", mid="yellow", high="blue", midpoint=median(pred.data$mpg)) 


eipi10 answered Oct 05 '22 06:10