I am trying to implement segmented regression as per this example Segmented Regression, Breakpoint analysis.
Now, how can i implement it in such a way the second part will be quadratic polynomial and remaining other things same.
I tried the same by changing Z= ~poly(DistanceMeters, 2)
however it didn't work.
Also, How can I get equations like
part 1: a1*x+b1
part 2: a2*x2**2 + b2*x + c1
part 3 :a3*x + b3
There are similar questions like this however they din't explain using segmented function.
By finding the differences between dependent values, you can determine the degree of the model for data given as ordered pairs. If the first difference is the same value, the model will be linear. If the second difference is the same value, the model will be quadratic.
Although this model allows for a nonlinear relationship between Y and X, polynomial regression is still considered linear regression since it is linear in the regression coefficients, β1,β2,...,βh β 1 , β 2 , . . . , β h !
What is the difference between quadratic regression and simple linear regression? Simple linear regression is used to find the equation of the straight line that best fits a set of data while quadratic regression is used to find the equation of the parabola that best fits a set of data.
A polynomial term–a quadratic (squared) or cubic (cubed) term turns a linear regression model into a curve. But because it is X that is squared or cubed, not the Beta coefficient, it still qualifies as a linear model.
I have two ideas, both with drawbacks. Maybe you can adjust one of them to your needs. Unfortunately cannot access Drive at the moment, so some artificial data used.
1. "Manually" fit polynomial models
Here you can specify whichever models you like, some segments can be lm's, some polynomials etc.
Code:
library(segmented)
library(ggplot2)
library(data.table)
# Data
set.seed(12)
xx <- 1:100
yy <- 2 + 1.5 * pmax(xx-35, 0) - 1.5 * pmax(xx-70, 0) + 15 * pmax(runif(100) - 0.5, 0) + rnorm(100, 0, 2)
dt <- data.table(x = xx, y = yy, type = 'act')
dt_all <- copy(dt)
# lm
lm_lin <- lm(y ~ x, data = dt)
summary(lm_lin)
# Find segments
lm_seg <- segmented(
lm_lin, seg.Z = ~ x, psi = list(x = c(30, 80)))
# "Manual" lm's
breaks <- unname(lm_seg$psi[, 'Est.'])
lm_poly1 <- lm(y ~ poly(x, 4), data = dt[x < breaks[1], ])
lm_2 <- lm(y ~ x, data = dt[x > breaks[1] & x < breaks[2], ])
lm_poly3 <- lm(y ~ poly(x, 4), data = dt[x > breaks[2], ])
dt_all <- rbind(
dt_all,
data.table(x = xx, y = c(
predict(lm_poly1),
predict(lm_2),
predict(lm_poly3)),
type = 'lm_poly'
)
)
2. Fit a gam model using breaks from segmented
and some splines
Here you will get a smooth transition between segments, but you have much less control on what is happening.
# Using splines for smooth segments
library(mgcv)
spl <- gam(y ~ s(x, bs = "cc", k = 12), data = dt, knots = list(xx = breaks))
# Plot
dt_all <- rbind(dt_all, data.table(x = xx, y = predict(spl), type = 'spl'))
ggplot(dt_all, aes(x = x, y = y)) + geom_point(size = 1) +
facet_grid(. ~ type) + theme_minimal()
Both can be done using e.g. list()
and lapply()
to automate a bit (for varying number of breaks etc.).
Edit:
By changing arguments of poly
and s
you can get slightly "better" fitting models, but for gam
errors on the edges are quite big, see for degree = 6
and k = 30
:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With