I run a loop 200 times within which I:
randomly divide my dataset into training and test sets
fit a neural networks model with R's nnet()
command on the training set
assess performance on the test set
I save each model to a list.
Now I want to use a combined model to make out-of-sample predictions. I've used the combine
function for this purpose on randomForest
objects. Is there a similar combine command for nnet
objects?
I cannot upload the dataset but below is the code I'm currently working with. It works as is, except for the last line where I seek a command to combine models.
n <- 200
nnet_preds <- matrix(NA, ncol = 1, nrow = n)
nnet_predstp <- matrix(NA, ncol = 1, nrow = n)
nnet_predstn <- matrix(NA, ncol = 1, nrow = n)
nnet_predsfptp <- matrix(NA, ncol = 1, nrow = n)
nnet_predsfntp <- matrix(NA, ncol = 1, nrow = n)
NN_predictions <- matrix(NA, ncol = 1, nrow = 10)
outcome_2010_NN <- testframe2[, "ytest"]
nn_model <- vector("list", n)
data<-testframe2[, c("sctownpop", "sctownhh", "scnum_tribes", "sctownmusl_percap", "scmuslim", "scprop_stranger", "sctownstrg_percap", "scprop_domgroup", "scexcom", "sctownexcm_percap", "sctownretn_percap", "scprop_under30", "scprop_male", "scedulevel", "scprop_noeduc", "scprop_anypeace", "scgroup_prtcptn", "scpubcontr", "scsafecommdum", "scciviccommdum", "scoll_action_add", "scngodependent", "scgovtdependent", "scpolicourtscorr", "screlmarry", "scmslmnolead", "sccrime_scale", "scviol_scale", "sclandconf_scale", "sctownnlnd_percap", "scnolandnofarm", "scfarmocc", "scunemployed", "scwealthindex", "scwealth_inequality", "scviol_experienced", "scviol_part", "scanylndtake", "scdisp_ref", "sfacilities", "sfreq_visits", "sctot_resources", "scmeanprice_above75", "scmosquesdum", "scmnrt_ldrshp", "scany_majorconf", "sstate_presence", "sremote", "scmobilec", "scradio_low")]
data = cbind(outcome_2010_NN, data)
sampleSplit = round(nrow(data)-(nrow(data)/5))
for(i in 1:n) {
set.seed(06511+i)
data_randomization <- data[sample(1:nrow(data), dim(data)[1], replace=FALSE), ]
train <- data_randomization[1:sampleSplit, ]
test <- data_randomization[(sampleSplit+1):nrow(data), ]
nn_model[[i]] <- nnet(outcome_2010_NN ~ sctownpop + sctownhh+ scnum_tribes+ sctownmusl_percap+ scmuslim+ scprop_stranger+ sctownstrg_percap+ scprop_domgroup+ scexcom+ sctownexcm_percap+ sctownretn_percap+ scprop_under30 + scprop_male+ scedulevel+ scprop_noeduc+ scprop_anypeace+ scgroup_prtcptn+ scpubcontr+ scsafecommdum+ scciviccommdum+ scoll_action_add+ scngodependent+ scgovtdependent+ scpolicourtscorr+ screlmarry+ scmslmnolead+ sccrime_scale+ scviol_scale+ sclandconf_scale+ sctownnlnd_percap+ scnolandnofarm+ scfarmocc+ scunemployed+ scwealthindex+ scwealth_inequality+ scviol_experienced+ scviol_part+ scanylndtake+ scdisp_ref+ sfacilities+ sfreq_visits+ sctot_resources+ scmeanprice_above75+ scmosquesdum+ scmnrt_ldrshp+ scany_majorconf+ sstate_presence+ sremote+ scmobilec+ scradio_low,
data=train, size = 3, decay = 0.1)# size=number of units/nodes in the (single_hidden layer); decay=parameter for weight decay. Default 0.
predictions <- predict(nn_model[[i]], test)
nnpredorder<-rank(predictions)
nncvpredictionsA50 <- ifelse( nnpredorder > 24, 1, 0 ) # manually optimized
errors <- table(test[, "outcome_2010_NN"], nncvpredictionsA50)
accuracy.rate <- (errors[1, 1]+errors[2, 2])/sum(errors)
true.pos.rate <- (errors[2, 2]/(errors[2, 2]+errors[2, 1]))
true.neg.rate <- (errors[1, 1]/(errors[1, 1]+errors[1, 2]))
FPTP <- (errors[1, 2]/errors[2, 2])
FNTP <- (errors[2, 1]/errors[2, 2])
nnet_preds[i, ] <- accuracy.rate
nnet_predstp[i, ] <- true.pos.rate
nnet_predstn[i, ] <- true.neg.rate
nnet_predsfptp[i, ] <- FPTP
nnet_predsfntp[i, ] <- FNTP
}
mean(nnet_preds); sd(nnet_preds)
mean(nnet_predstp); sd(nnet_predstp)
NN_predictions[1, ] <- mean(nnet_predstp) # TP accuracy rate (sensitivity)
NN_predictions[2, ] <- sd(nnet_predstp) # TP accuracy rate (sensitivity)
NN_predictions[3, ] <- mean(nnet_predstn) # TN accuracy rate (specificity)
NN_predictions[4, ] <- sd(nnet_predstn) # TN accuracy rate (specificity)
NN_predictions[5, ] <- mean(nnet_preds) # Accuracy rate
NN_predictions[6, ] <- sd(nnet_preds) # Accuracy rate
NN_predictions[7, ] <- mean(nnet_predsfptp) # Ratio FP:TP
NN_predictions[8, ] <- sd(nnet_predsfptp) # Ratio FP:TP
NN_predictions[9, ] <- mean(nnet_predsfntp) # Ratio FN:TP
NN_predictions[10, ] <- sd(nnet_predsfntp) # Ratio FN:TP
print(NN_predictions)
### Combine NN models #Where `combine` is the randomForest command
aggNNmodel <- do.call(combine, nn_model)
You should not be able to use Random Forest's combine method since it is for decision trees. But Random Forest is boosting algorithm therefore you should be able to use a boosting algorithm for combining neural network models.
Boosting is an approach to combine weak learners but there is not rule against using a strong learner like neural network for boosting.
Can a set of weak learners create a single strong learner?
Use a boosting algorithm like AdaBoost with your Neural network as its weak learner. A google search shows couple of boosting packages in R.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With