How to obtain a confidence interval or a measure of prediction dispersion when using xgboost for classification?
For example, if xgboost predicts that the probability of an event is 0.9, how can the confidence in that probability be obtained?
Also, is this confidence assumed to be heteroskedastic?
To produce confidence intervals for an xgboost model, train several models (you can use bagging for this). Each model produces a response for a given test sample, and together those responses form a distribution from which you can compute confidence intervals using basic statistics, such as percentiles. Build this response distribution separately for each test sample. Because every test sample gets its own distribution, the width of the interval can differ from one sample to another.
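A minimal sketch of this bagging approach, assuming a binary problem, NumPy arrays as input, and the scikit-learn-style `XGBClassifier` wrapper. The helper names `bootstrap_proba_intervals` and `make_xgb`, the number of models, and the hyperparameters are illustrative choices, not part of the original answer. It also assumes every bootstrap resample still contains both classes.

```python
import numpy as np


def bootstrap_proba_intervals(make_model, X_train, y_train, X_test,
                              n_models=50, alpha=0.05, seed=0):
    """Fit n_models on bootstrap resamples of the training data.

    Returns, for each test sample, the mean predicted probability of
    class 1 and the lower/upper percentile bounds across the models.
    """
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.empty((n_models, len(X_test)))
    for b in range(n_models):
        idx = rng.integers(0, n, size=n)  # draw rows with replacement
        model = make_model(b)             # fresh model for each resample
        model.fit(X_train[idx], y_train[idx])
        preds[b] = model.predict_proba(X_test)[:, 1]  # P(class 1)
    # Percentile interval, computed separately for every test sample.
    lower = np.quantile(preds, alpha / 2, axis=0)
    upper = np.quantile(preds, 1 - alpha / 2, axis=0)
    return preds.mean(axis=0), lower, upper


def make_xgb(seed):
    # Hypothetical factory; hyperparameters are placeholders.
    # Imported lazily so the helper above works with any classifier
    # that exposes fit() and predict_proba().
    from xgboost import XGBClassifier
    return XGBClassifier(n_estimators=100, max_depth=3, random_state=seed)
```

Calling `bootstrap_proba_intervals(make_xgb, X_train, y_train, X_test)` returns three arrays with one entry per test sample. The spread between the lower and upper bounds reflects how much the predicted probability changes when the training data is resampled, so it is best read as a measure of model variability rather than a calibrated guarantee.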