I am trying to bootstrap a simple multinomial regression in R, and I am getting an error:
Error in is.data.frame(data) : object 'd' not found
What is really strange is that I am using the same code (adjusted to this particular problem) as in a tutorial for boot package at Quick-R, and that same code also worked when I am using different function (like lm()). For sure, I am doing something stupid, but I do not see what. Please, if anyone can help, I would appreciate a lot.
This is an example:
require(foreign)
require(nnet)
require(boot)
# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d)
s = summary(fit)
return(list(fit$coefficients, fit$standard.errors))
}
# 5 replications
results = list()
results <- boot(
data=ml, statistic=bs, R=5, parallel='multicore',
formula=prog~write
)
The error happens in the summary() part, also the object returned by multinom() does not have coefficients and standard.errors. It seems, that summary.multinom() in turn calculates the hessian from your data, d, which for some reason (probably a scoping issue) cannot be found. A quick fix is to add Hess = TRUE:
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d, Hess = TRUE)
s = summary(fit)
return( cbind(s$coefficients, s$standard.errors) )
}
# 5 replications
results = list()
results <- boot(
data=ml, statistic=bs, R=5, parallel='multicore',
formula=prog~write
)
Multinomial logistic regression returns a matrix of coefficients using the coef() function. This differs from a lm or glm model which returns a vector of coefficients.
library(foreign) # read.dta()
library(nnet) # multinom()
require(boot) # boot()
# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]
names(ml)
bs <- function(formula, data, indices) {
d = data[indices,] # allows boot to select sample
fit = multinom(formula, data=d, maxit=1000, trace=FALSE)
#s = summary(fit)
#return(list(fit$coefficients, fit$standard.errors))
estimates <- coef(fit)
return(t(estimates))
}
# enable parallel
library(parallel)
cl <- makeCluster(2)
clusterExport(cl, "multinom")
# 10000 replications
set.seed(1984)
results <- boot(
data=ml, statistic=bs, R=10000, parallel = "snow", ncpus=2, cl=cl,
formula=prog~write
)
# label the estimates
subModelNames <- colnames(results$t0)
varNames <- rownames(results$t0)
results$t0
estNames <- apply(expand.grid(varNames,subModelNames),1,function(x) paste(x,collapse="_"))
estNames
colnames(results$t) <- estNames
# summary of results
library(car)
summary(results)
confint(results, level=0.95, type="norm")
confint(results, level=0.95, type="perc")
confint(results, level=0.95, type="bca")
# plot the results
hist(results, legend="separate")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With