I would like to export the model below so that another user can open it and use the predict
function to predict classes on new observations. That is the only thing it will be used for. I can save mod_fit, but it takes up a lot of space, and the end user can access information which I don't want to share. Is there an easy way?
library(caret)
library(dplyr)
iris2 <- iris %>% filter(Species != "setosa") %>% mutate(Species = as.character(Species))
mod_fit <- train(Species ~., data = iris2, method = "glm")
The following is a generic procedure for trimming an R object of data that may not be necessary for the target use. It is heuristic in nature, but I have already applied it successfully twice, and with a bit of luck it works quite well.
You can measure object size using the object.size function:
> object.size(mod_fit)
528616 bytes
Indeed, quite a lot for a linear model with four predictors. You can inspect what's inside the object using, for example, the str function:
> str(mod_fit)
List of 23
$ method : chr "glm"
$ modelInfo :List of 15
..$ label : chr "Generalized Linear Model"
..$ library : NULL
..$ loop : NULL
..$ type : chr [1:2] "Regression" "Classification"
..$ parameters:'data.frame': 1 obs. of 3 variables:
.. ..$ parameter: Factor w/ 1 level "parameter": 1
.. ..$ class : Factor w/ 1 level "character": 1
[…]
$ coefnames : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
$ xlevels : Named list()
- attr(*, "class")= chr [1:2] "train" "train.formula"
Quite a lot of data. So let's check how much space each of these elements takes:
> sort(sapply(mod_fit, object.size))
pred preProcess yLimits dots maximize method
0 0 0 40 48 96
modelType metric perfNames xlevels coefnames levels
104 104 160 192 296 328
call bestTune results times resample resampledCM
936 1104 1584 2024 2912 4152
trainingData terms control modelInfo finalModel
5256 6112 29864 211824 259456
Now we can try removing elements from this object one by one, starting from the largest, and check which are necessary for predict to work:
> test_obj <- mod_fit; test_obj$finalModel <- NULL; predict(test_obj, iris2)
Error in if (modelFit$problemType == "Classification") { :
argument is of length zero
Whoops, finalModel seems important. Any kind of error here tells you that you can't remove the element. How about, let's say, control?
> test_obj <- mod_fit; test_obj$control <- NULL; predict(test_obj, iris2)
[1] versicolor versicolor versicolor versicolor versicolor versicolor
[7] versicolor versicolor versicolor versicolor versicolor versicolor
[13] versicolor versicolor versicolor versicolor versicolor versicolor
[…]
[97] virginica virginica virginica virginica
Levels: versicolor virginica
So it seems that control is not needed. You can perform this process recursively, for example:
> sort(sapply(mod_fit$finalModel, object.size))
offset contrasts param rank
0 0 40 48
[…]
model family
17056 163936
> sort(sapply(mod_fit$finalModel$family, object.size))
link family valideta linkfun linkinv mu.eta dev.resids
96 104 272 560 560 560 1992
variance validmu initialize aic simulate
2064 6344 18712 27512 103888
> test_obj <- mod_fit; test_obj$finalModel$family$simulate <- NULL; predict(test_obj, iris2)
[1] versicolor versicolor versicolor versicolor versicolor versicolor
[…]
[97] virginica virginica virginica virginica
Levels: versicolor virginica
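The trial-and-error loop above can be automated. Here is a rough sketch of a helper (the name shrink_model is my own, not from any package) that tries removing each top-level element, keeps the removal only if predict still returns the same result, and silently keeps anything whose removal errors or changes predictions:

```r
library(caret)
library(dplyr)

iris2 <- iris %>% filter(Species != "setosa") %>% mutate(Species = as.character(Species))
mod_fit <- train(Species ~ ., data = iris2, method = "glm")

# Hypothetical helper: drop each top-level element in turn, keeping the
# removal only when predict() still works and gives identical output.
shrink_model <- function(obj, newdata) {
  baseline <- predict(obj, newdata)
  for (nm in names(obj)) {
    candidate <- obj
    candidate[[nm]] <- NULL  # remove one element
    ok <- tryCatch(
      identical(predict(candidate, newdata), baseline),
      error = function(e) FALSE  # errors mean the element is needed
    )
    if (ok) obj <- candidate
  }
  obj
}

slim_fit <- shrink_model(mod_fit, iris2)
object.size(slim_fit)  # smaller than object.size(mod_fit)
```

Note that this checks predictions only against the data you pass in, so it inherits the caveat below: an element used only for some inputs (or only for type = "prob", say) could still be dropped by mistake.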
With enough attempts you will know which parts of the object are necessary, and which are not—and remove them before storing the model.
Note: while this removes unnecessary parts of the object, you may accidentally remove parts that are only sometimes used in prediction. For simple models that always work the same way, like glm, this should not happen, though.
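Trimming aside, you can also shrink the file on disk by picking a stronger compression method when serializing. A sketch using base R's saveRDS, whose compress argument accepts "gzip" (the default), "bzip2", or "xz"; the actual savings depend on the object:

```r
library(caret)
library(dplyr)

iris2 <- iris %>% filter(Species != "setosa") %>% mutate(Species = as.character(Species))
mod_fit <- train(Species ~ ., data = iris2, method = "glm")

# xz usually compresses better than the default gzip, at the cost of
# slower writing; a hypothetical file path for illustration.
path <- file.path(tempdir(), "mod_fit.rds")
saveRDS(mod_fit, path, compress = "xz")

# The other user only needs readRDS() and predict():
mod_fit2 <- readRDS(path)
predict(mod_fit2, newdata = iris2)
```

This does not remove any information from the object, so it helps with the size concern but not with hiding its contents.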
Also, the result of this process is not guaranteed to be free of information you don't want the model's user to see. There is no such guarantee in general: there are methods for reconstructing significant information about models and training data even from black-box models that are not otherwise easy to interpret.