I am trying to bootstrap a simple multinomial regression in R, and I am getting an error:
Error in is.data.frame(data) : object 'd' not found
What is really strange is that I am using the same code (adjusted to this particular problem) as in a tutorial for boot package at Quick-R, and that same code also worked when I am using different function (like lm()). For sure, I am doing something stupid, but I do not see what. Please, if anyone can help, I would appreciate a lot.
This is an example:
require(foreign)
require(nnet)
require(boot)
# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d)
s = summary(fit)
return(list(fit$coefficients, fit$standard.errors))
}
# 5 replications
results = list()
results <- boot(
data=ml, statistic=bs, R=5, parallel='multicore',
formula=prog~write
)
The error happens in the summary()
part, also the object returned by multinom()
does not have coefficients
and standard.errors
. It seems, that summary.multinom()
in turn calculates the hessian from your data, d
, which for some reason (probably a scoping issue) cannot be found. A quick fix is to add Hess = TRUE
:
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d, Hess = TRUE)
s = summary(fit)
return( cbind(s$coefficients, s$standard.errors) )
}
# 5 replications
results = list()
results <- boot(
data=ml, statistic=bs, R=5, parallel='multicore',
formula=prog~write
)
Multinomial logistic regression returns a matrix of coefficients using the coef()
function. This differs from a lm
or glm
model which returns a vector of coefficients.
library(foreign) # read.dta()
library(nnet) # multinom()
require(boot) # boot()
# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]
names(ml)
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d, maxit=1000, trace=FALSE)
#s = summary(fit)
#return(list(fit$coefficients, fit$standard.errors))
estimates <- coef(fit)
return(t(estimates))
}
# enable parallel
library(parallel)
cl <- makeCluster(2)
clusterExport(cl, "multinom")
# 10000 replications
set.seed(1984)
results <- boot(
data=ml, statistic=bs, R=10000, parallel = "snow", ncpus=2, cl=cl,
formula=prog~write
)
# label the estimates
subModelNames <- colnames(results$t0)
varNames <- rownames(results$t0)
results$t0
estNames <- apply(expand.grid(varNames,subModelNames),1,function(x) paste(x,collapse="_"))
estNames
colnames(results$t) <- estNames
# summary of results
library(car)
summary(results)
confint(results, level=0.95, type="norm")
confint(results, level=0.95, type="perc")
confint(results, level=0.95, type="bca")
# plot the results
hist(results, legend="separate")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With