I'm using caret's PCA preprocessing.
multinomFit <- train(LoanStatus ~ .,
                     train,
                     method = "multinom",
                     std = TRUE,
                     family = binomial,
                     metric = "ROC",
                     thresh = 0.85,
                     verbose = TRUE,
                     pcaComp = 7,
                     preProcess = c("center", "scale", "pca"),
                     trControl = ctrl)
I specified the number of PCA components to be 7. Why does the summary show a fit using 68 components?
summary(multinomFit)
Call:
multinom(formula = .outcome ~ ., data = dat, decay = param$decay,
std = TRUE, family = ..2, thresh = 0.85, verbose = TRUE,
pcaComp = 7)
Coefficients:
Values Std. Err.
(Intercept) 1.6650694329 0.03760419
PC1 -0.1023790683 0.01474812
PC2 0.0375344688 0.01554707
PC3 -0.1012080589 0.01870754
PC4 -0.1004020357 0.02418817
PC5 0.0707421015 0.02403815
PC6 0.0034671796 0.02535015
PC7 0.1218028495 0.02852909
PC8 0.2191031963 0.03291266
PC9 0.1534144811 0.02986523
PC10 -0.0665337138 0.02999863
PC11 -0.1313662645 0.03032963
PC12 0.0668422208 0.03397493
PC13 0.0002770594 0.03282500
PC14 -0.0883400819 0.03337427
PC15 0.0221726084 0.03323058
PC16 -0.0222984250 0.03210718
PC17 -0.0394014147 0.03282160
PC18 0.0280583827 0.03459664
PC19 -0.0295243295 0.03430506
PC20 -0.0149573710 0.03358775
PC21 0.0653722886 0.03388418
PC22 -0.0114810174 0.03583050
PC23 -0.0594912738 0.03376091
PC24 0.0117123190 0.03476835
PC25 -0.0406770388 0.03507369
PC26 0.0373200991 0.03440807
PC27 0.0050323427 0.03366658
PC28 0.0678087286 0.03516197
PC29 0.0234294196 0.03459586
PC30 0.0540846491 0.03464610
PC31 0.1054946257 0.03459315
PC32 0.0216292907 0.03485001
PC33 0.0247627243 0.03488016
PC34 0.0033126360 0.03402770
PC35 -0.0434168834 0.03468038
PC36 -0.0098687981 0.03497515
PC37 -0.0193788562 0.03268054
PC38 0.0572276670 0.03837009
PC39 0.0535213906 0.03737078
PC40 0.0007157334 0.03321343
PC41 -0.0286461676 0.03546742
PC42 0.0640903943 0.03378855
PC43 -0.0111873647 0.03626063
PC44 -0.0304589978 0.03448459
PC45 0.0191817954 0.03690284
PC46 -0.0330040383 0.03277895
PC47 0.0328641857 0.03460263
PC48 0.0204941541 0.03460759
PC49 0.0345105736 0.04002168
PC50 0.0076131373 0.03621336
PC51 0.0082765068 0.03299395
PC52 -0.0594596197 0.03633509
PC53 -0.0276656822 0.03596515
PC54 0.0411414647 0.03529887
PC55 -0.0644394706 0.03490393
PC56 -0.0266971243 0.03403656
PC57 -0.1415322396 0.03681683
PC58 -0.0332329932 0.03469459
PC59 -0.0273683007 0.03524604
PC60 0.0450430472 0.03586438
PC61 -0.0708218651 0.03807458
PC62 0.1523605734 0.03851722
PC63 -0.0385759566 0.03920662
PC64 -0.0602633030 0.03902837
PC65 0.0547553856 0.03970764
PC66 0.0727331180 0.04273518
PC67 0.1142574406 0.04522347
PC68 -0.1059928013 0.04077592
Residual Deviance: 5273.035
AIC: 5411.035
Finally, is there a way to map the 7 PCA factors which describe 85% of the variation in the data back to 7 input attributes in the original observations?
Thanks in advance.
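Extra arguments given to train() are forwarded to the model function, not to the pre-processing step, which is why thresh = 0.85 and pcaComp = 7 show up inside the multinom() call in your summary. To set pre-processing options, pass them via preProcOptions in trainControl(); have a look at ?trainControl. Here is an example: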
ctrl <- trainControl(method = "repeatedcv",
                     repeats = 3,
                     classProbs = TRUE,
                     preProcOptions = list(thresh = 0.85), # or list(pcaComp = 7)
                     summaryFunction = twoClassSummary)
multinomFit <- train(LoanStatus ~ ., train,
                     method = "multinom",
                     family = binomial,
                     metric = "ROC",
                     verbose = TRUE,
                     preProcess = c("center", "scale", "pca"),
                     trControl = ctrl)
Note that if you specify the number of PCA components with pcaComp = 7, it overrides thresh (have a look at ?preProcess), so use only one of them.
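To see the difference between the two options outside of train(), here is a minimal sketch that calls preProcess() directly. It uses a few numeric columns of the built-in mtcars data as a stand-in for your loan predictors (that choice, and the value pcaComp = 2, are just assumptions for illustration):

```r
library(caret)

# Toy numeric predictors standing in for the real loan data (assumption).
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# Keep however many components are needed to reach 85% cumulative variance.
pp_thresh <- preProcess(x, method = c("center", "scale", "pca"),
                        thresh = 0.85)

# Ask for exactly 2 components; when pcaComp is given, thresh is not used.
pp_fixed <- preProcess(x, method = c("center", "scale", "pca"),
                       thresh = 0.85, pcaComp = 2)

# Number of components each pre-processing object retained.
pp_thresh$numComp
pp_fixed$numComp

# The transformed data contains only the retained components.
x_pca <- predict(pp_fixed, x)
colnames(x_pca)
```

Checking numComp like this before fitting is a quick way to confirm that the model will see the number of components you expect.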
You can view the contribution (loading) of each original variable to each PCA component with:
multinomFit$preProcess$rotation
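Keep in mind that each principal component is a linear combination of all the original variables, so in general the 7 components do not correspond one-to-one to 7 input attributes. What the rotation matrix can tell you is which variables weigh most heavily on each component. A rough sketch of that, again on mtcars columns as a stand-in for your data (with your fitted model you would start from multinomFit$preProcess$rotation instead):

```r
library(caret)

# Toy numeric predictors standing in for the real loan data (assumption).
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

pp <- preProcess(x, method = c("center", "scale", "pca"), pcaComp = 2)

# Rows are original variables, columns are principal components.
rot <- pp$rotation

# For each component, the original variable with the largest absolute loading.
top_var <- rownames(rot)[apply(abs(rot), 2, which.max)]
names(top_var) <- colnames(rot)
top_var

# Full ranking of variables for the first component, strongest first.
sort(abs(rot[, "PC1"]), decreasing = TRUE)
```

The largest loading only indicates the dominant contributor; the other variables still contribute to the component.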