I have a class imbalance problem and have been experimenting with a weighted Random Forest using the implementation in scikit-learn (>= 0.16).
I have noticed that the implementation takes a class_weight parameter in the tree constructor and a sample_weight parameter in the fit method to help solve class imbalance. The two seem to be multiplied together to decide the final weight.
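For concreteness, here is a minimal sketch of the setup I mean (the data and the weight values are made up purely for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy imbalanced data set, purely for illustration
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = np.array([0] * 90 + [1] * 10)  # 90/10 class imbalance

# class_weight is passed to the constructor...
clf = RandomForestClassifier(n_estimators=50,
                             class_weight={0: 1.0, 1: 9.0},
                             random_state=0)

# ...while sample_weight is passed to fit(); here every sample gets weight 1
clf.fit(X, y, sample_weight=np.ones(len(y)))
```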
I have trouble understanding the following:
sample_weight augments the probability estimates in the probability array ... which augments the impurity measure ... which augments how nodes are split ... which augments how the tree is built ... which augments how the feature space is diced up for classification.
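To see the first link in that chain, here is a hand-rolled sketch of a weighted Gini impurity. This is my reading of how weighting enters the impurity measure, not the actual scikit-learn source: class probabilities at a node become weighted fractions rather than raw counts.

```python
import numpy as np

def weighted_gini(y, sample_weight):
    """Gini impurity where each sample contributes its weight,
    not a raw count of 1."""
    total = sample_weight.sum()
    impurity = 1.0
    for cls in np.unique(y):
        p = sample_weight[y == cls].sum() / total  # weighted class probability
        impurity -= p ** 2
    return impurity

y = np.array([0, 0, 0, 1])
print(weighted_gini(y, np.ones(4)))              # unweighted: 1 - (0.75^2 + 0.25^2) = 0.375
print(weighted_gini(y, np.array([1, 1, 1, 3])))  # upweighting class 1: 1 - 2 * 0.5^2 = 0.5
```

Upweighting the minority sample raises the node's impurity, which is exactly what pushes the splitter to separate the minority class.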
RandomForests are built on Trees, which are very well documented. Check how Trees use the sample weighting:
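The documentation isn't quoted here, but one well-known property of weighted tree fitting illustrates the idea: for a plain DecisionTreeClassifier with default leaf-size settings, giving a sample an integer weight k should behave like duplicating that sample k times. A quick sanity check, under that assumption:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.RandomState(42)
X = rng.randn(30, 3)
y = rng.randint(0, 2, size=30)

# Tree A: first sample given weight 3, everything else weight 1
w = np.ones(len(y))
w[0] = 3
tree_a = DecisionTreeClassifier(random_state=0)
tree_a.fit(X, y, sample_weight=w)

# Tree B: same data with the first sample physically duplicated twice more
X_dup = np.vstack([X, X[0], X[0]])
y_dup = np.concatenate([y, [y[0], y[0]]])
tree_b = DecisionTreeClassifier(random_state=0).fit(X_dup, y_dup)

# With default settings the two learned structures should coincide
print(export_text(tree_a) == export_text(tree_b))
```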
As for the difference between class_weight and sample_weight: much can be determined simply by the nature of their datatypes. sample_weight is a 1D array of length n_samples, assigning an explicit weight to each example used for training. class_weight is either a dictionary mapping each class to a uniform weight for that class (e.g., {1: .9, 2: .5, 3: .01}), or a string telling sklearn how to determine this dictionary automatically.

So the training weight for a given example is the product of its explicitly specified sample_weight (or 1 if sample_weight is not provided) and its class_weight (or 1 if class_weight is not provided).
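A quick way to convince yourself of that product rule (a sanity check I'd run, not something taken from the docs): build the combined per-sample weights by hand and check that a forest trained with them matches one trained with class_weight plus sample_weight.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(1)
X = rng.randn(200, 5)
y = (rng.rand(200) > 0.8).astype(int)    # imbalanced labels
sw = rng.uniform(0.5, 2.0, size=len(y))  # arbitrary per-sample weights
cw = {0: 1.0, 1: 4.0}                    # per-class weights

# Forest 1: let sklearn combine class_weight and sample_weight
rf1 = RandomForestClassifier(n_estimators=25, class_weight=cw, random_state=0)
rf1.fit(X, y, sample_weight=sw)

# Forest 2: multiply the weights ourselves and pass no class_weight
combined = sw * np.array([cw[c] for c in y])  # sample_weight * class_weight per example
rf2 = RandomForestClassifier(n_estimators=25, random_state=0)
rf2.fit(X, y, sample_weight=combined)

# Same random_state + same effective weights -> should print True
print(np.allclose(rf1.predict_proba(X), rf2.predict_proba(X)))
```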