Collecting out-of-fold predictions from a caret model

I want to use the out-of-fold predictions from a caret model to train a second-stage model that includes some of the original predictors. I can collect the out-of-fold predictions as follows:

#Load Data
set.seed(1)
library(caret)
library(mlbench)
data(BostonHousing)

#Build Model (see ?train)
rpartFit <- train(medv ~ . + rm:lstat, data = BostonHousing, method = "rpart",
                  trControl = trainControl(method = 'cv', number = 10,
                                           savePredictions = TRUE))

#Collect out-of-fold predictions
out_of_fold <- rpartFit$pred
bestCP <- rpartFit$bestTune[,'.cp']
out_of_fold <- out_of_fold[out_of_fold$.cp==bestCP,]

This works, but the predictions are in the wrong order:

> all.equal(out_of_fold$obs, BostonHousing$medv)
[1] "Mean relative difference: 0.4521906"

I know the train object stores a list of the row indexes used to train each fold:

> str(rpartFit$control$index)
List of 10
 $ Fold01: int [1:457] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold02: int [1:454] 2 3 4 8 10 11 12 13 14 15 ...
 $ Fold03: int [1:457] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold04: int [1:455] 1 2 3 5 6 7 8 9 10 11 ...
 $ Fold05: int [1:455] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold06: int [1:455] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold07: int [1:457] 1 3 4 5 6 7 8 9 10 13 ...
 $ Fold08: int [1:455] 1 2 4 5 6 7 9 11 12 14 ...
 $ Fold09: int [1:455] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold10: int [1:454] 1 2 3 4 5 6 7 8 9 10 ...

How can I use this information to put the observations in my out_of_fold object in the same order as the original BostonHousing dataset?


asked Jun 29 '12 19:06

Zach

1 Answer

I'll add another column to the output that indicates the original row number for each sample in the next release (probably a month from now).

Max
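As a rough sketch of how that row-number column can be used: current caret releases expose it as `rowIndex` in `rpartFit$pred`. The snippet below assumes a caret version that has this column, and it uses `merge()` on `bestTune` so the tuning-parameter column name (`cp` or `.cp`, depending on the release) does not have to be hard-coded. The 10 folds are taken from the `str()` output in the question.

```r
library(caret)
library(mlbench)
data(BostonHousing)

set.seed(1)
rpartFit <- train(medv ~ . + rm:lstat, data = BostonHousing, method = "rpart",
                  trControl = trainControl(method = "cv", number = 10,
                                           savePredictions = TRUE))

# Keep only the predictions made with the selected tuning parameters.
# merge() joins on the columns shared by both data frames, i.e. the
# tuning-parameter column(s) in bestTune.
out_of_fold <- merge(rpartFit$pred, rpartFit$bestTune)

# rowIndex holds the row of the training data each prediction belongs to
# (assumes a caret release that includes this column)
out_of_fold <- out_of_fold[order(out_of_fold$rowIndex), ]

# Check that the observations now line up with the original data
all.equal(out_of_fold$obs, BostonHousing$medv)
```

Once sorted this way, `out_of_fold$pred` can be bound to the original predictors with `cbind()` to build the second-stage training set.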


answered Oct 05 '22 04:10

topepo

Tags: r, r-caret, cross-validation
