Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I use the predict function in R in a logistic regression fitted years ago?

I have a problem that I am trying to resolve with no success. More than two days searching and I didn’t get a single clue. Sorry if the answer is out there and I didn’t find it.

Suppose that you have a logistic equation regression (binary model) from an old model that you estimated some years ago. Therefore you know the parameters βk (k = 1, 2, ..., p) because they were estimated in the past. But you don’t have the data that were used to fit the model.

My question is: can I introduce this old estimated logistic model in R as an object (corresponding to a logistic regression model)?

I would like to use the “predict” function to prove this logistic regression with a new set of data (present data) and then check the validity of this old model standing the test of time. And to use this function you need the object of the logistic regression model.

Thank you very much in advance.

like image 750
perevales Avatar asked Oct 09 '13 08:10

perevales


People also ask

What does predict function do in R for logistic regression?

The predict() function can be used to predict the probability that the market will go up, given values of the predictors. The type="response" option tells R to output probabilities of the form P(Y = 1|X) , as opposed to other information such as the logit .

How do you find the predicted value in logistic regression?

Logistic function The standard logistic regression function, for predicting the outcome of an observation given a predictor variable (x), is an s-shaped curve defined as p = exp(y) / [1 + exp(y)] (James et al.

How do you find the predicted value in R?

The predict() function in R is used to predict the values based on the input data. All the modeling aspects in the R program will make use of the predict() function in their own way, but note that the functionality of the predict() function remains the same irrespective of the case.

Can we predict using logistic regression?

Logistic regression is a classification algorithm. It is used to predict a binary outcome based on a set of independent variables.


2 Answers

Per my comment, I think you could start by just calculating predictions directly from the coefficients. Here's an example that compares the output from predict.glm to predicted probabilities calculated directly on the data:

# construct some data and model it
# y ~ x1 + x2
set.seed(1)
x1 <- runif(100)
x2 <- runif(100)
y <- rbinom(100,1,(x1+x2)/2)
data1 <- data.frame(x1=x1,x2=x2,y=y)
x3 <- runif(100)
x4 <- runif(100)
y2 <- rbinom(100,1,(x3+x4)/2)
data2 <- data.frame(x1=x3,x2=x4,y=y2)
glm1 <- glm(y~x1+x2,data=data1,family=binomial)

# extract coefs
#summary(glm1)
coef1 <- coef(glm1)

# calculate predicted probabilities for current data
tmp1 <- coef1[1] + (data1$x1*coef1[2]) + (data1$x2*coef1[3])
pr1 <- 1/(1+(1/exp(tmp1)))
# these match those from `predict`:
all.equal(pr1,predict(glm1,data1,type='response'))

# now apply to new data:
tmp2 <- coef1[1] + (data2$x1*coef1[2]) + (data2$x2*coef1[3])
pr2 <- 1/(1+(1/exp(tmp2)))
pr2

This is obviously not a general solution, nor does it properly handle uncertainty, but I think it's a better approach than hacking predict.

like image 143
Thomas Avatar answered Sep 29 '22 14:09

Thomas


You can create a glm fit with only an offset created from the coefficients that you have, then use the regular predict function with that. For example using the iris data (first fitting a model on the real data, then fitting a new model using dummy data and the coefficients from the first fit):

fit1 <- glm( I(Species=='versicolor') ~ Petal.Length + Petal.Width, 
   data=iris, family=binomial )
coef(fit1)

dummydata <- data.frame( Petal.Length = rnorm(10), Petal.Width=rnorm(10),
    Species = rep(c('versicolor','other'), each=5) )

fit2 <- glm( I(Species=='versicolor') ~ 0 + 
  offset(-2.863708 + 1.563076*Petal.Length - 3.153165*Petal.Width),
    data=dummydata, family=binomial )

pred1 <- predict(fit1, newdata=iris)
pred2 <- predict(fit2, newdata=iris)
plot(pred1,pred2)
abline(0,1, col='green')
like image 25
Greg Snow Avatar answered Sep 29 '22 15:09

Greg Snow