I have a question about the formula and the user defined function:
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
g1 = glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma)
dc = clotting
dc$u = 1
predict(g1, dc)
1 2 3 4 5 6 7 8 9
-0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
However, if I just simply wrap the poly as a user defined function ( in reality I would have my own more complex function) then I will get error:
xpoly <- function(x, degree=1){poly(x,degree)}
g2 = glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
predict(g2, dc)
Error in poly(x, degree) :
'degree' must be less than number of unique points
It seems that the predict treats the user defined function in the formula with I(). My question is how can I get the results for Case2 same as case1?
Anyone can have any idea about this?
poly
is a bit of a unique function here. By default, it returns a set of orthogonal polynomials so it's doing some centering and rescaling of the data. If you want to be able to predict using the coefficients from the fitted model, you would need to transform new data in the same way that was done with the original data. This means some additional data must be passed along.
First I will point out that if you use the raw, non-orthogonal values, you would not experience this problem.
g1 <- glm(lot1 ~ log(u) + poly(u,1, raw=T), data = clotting, family = Gamma)
xpoly<-function(x,degree=1){poly(x,degree, raw=T)}
g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
dc=clotting
dc$u=1
predict(g1,dc)
# 1 2 3 4 5 6 7 8 9
#-0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
predict(g2,dc)
# 1 2 3 4 5 6 7 8 9
#-0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
But let's further explore how poly
passing along the scaling information to predict
. The work actually happens in the model.frame
function. Compare these two results
attr(terms(model.frame(lot1 ~ log(u) + poly(u,1), clotting)), "predvar")
# list(lot1, log(u), poly(u, 1, coefs = list(alpha = 40, norm2 = c(1,
9, 8850))))
attr(terms(model.frame(lot1 ~ log(u) + xpoly(u,1), clotting)), "predvar")
# list(lot1, log(u), xpoly(u, 1))
You can see that the call to poly()
in the first formula has been adjusted in the predvar
attribute of the formula that's returned. This done in the model.frame
code
...
if (is.null(attr(formula, "predvars"))) {
for (i in seq_along(varnames)) predvars[[i + 1L]] <- makepredictcall(variables[[i]],
vars[[i + 1L]])
attr(formula, "predvars") <- predvars
}
...
Note that it calls the makepredictcall()
function which is a generic function which dispatches based on the class of the returned object. It happens that poly
returns an object of class "poly"
class(poly(1:5, 1))
# [1] "poly" "matrix"
So what get's called for the "poly" data is this function
stats:::makepredictcall.poly
function (var, call)
{
if (as.character(call)[1L] != "poly")
return(call)
call$coefs <- attr(var, "coefs")
call
}
<bytecode: 0x123262178>
<environment: namespace:stats>
This is where the coef=
attributes are added. But also note that it checks that the call was from the "poly" function itself. Since your function is named "xpoly" but returns a "poly" object, the coefficient information isn't returned. One workaround would be to change the return class of your object and create your own makepredictcall
function. For example you could do
xpoly <- function(...){p<-poly(...); class(p)[1]<-"xpoly"; p}
makepredictcall.xpoly <- function(var, call) {
call$coefs <- attr(var, "coefs")
call
}
Note that this new version of xpoly
will also accept the coef=
argument and pass it along to poly()
via the ...
parameters. Then you can run
g1 <- glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma)
g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
predict(g1,dc)
# 1 2 3 4 5 6 7 8 9
#-0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
predict(g2,dc)
# 1 2 3 4 5 6 7 8 9
#-0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With