I implemented AdaBoost for a project, but I'm not sure I've understood it correctly. Here's what I implemented; please let me know whether it is a correct interpretation.
Now I use AdaBoost. My interpretation of AdaBoost is that it will find a final classifier as a weighted combination of the classifiers I have trained above, and its role is to find those combination weights. So, for every training example I have 8 predictions (one per network), and I'm combining them using the AdaBoost weights. Note that with this interpretation, the weak classifiers are not retrained during the AdaBoost iterations; only the example weights are updated. But the updated weights in effect select a new classifier in each iteration.
Here's the pseudo code:
all_alphas = []
all_classifier_indices = []
initialize all training example weights to 1/(num of examples)
compute error for all 8 networks on the training set
for i in 1 to T:
    find the classifier with the lowest weighted error
    compute the classifier weight (alpha) according to the AdaBoost confidence formula
    update the example weight distribution according to the AdaBoost weight-update formula
    all_alphas.append(alpha)
    all_classifier_indices.append(selected_classifier)
After T iterations, there are T alphas and T classifier indices; each of these T classifier indices points to one of the 8 neural net prediction vectors. Then on the test set, for every example, I predict by summing alpha * classifier prediction over the T rounds.
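In Python, the scheme described above might look like the following sketch (my own illustration, not the asker's code; the {-1, +1} label convention, the array names, and the `1e-12` guard against zero error are assumptions):

```python
import numpy as np

def boost_fixed_classifiers(preds, y, T):
    """AdaBoost-style weighting over a FIXED pool of pre-trained classifiers.

    preds: (num_classifiers, num_examples) array of {-1, +1} predictions
           (e.g. from the 8 pre-trained networks).
    y:     (num_examples,) array of {-1, +1} labels.
    """
    n_clf, n = preds.shape
    w = np.full(n, 1.0 / n)                 # example weights, start uniform
    alphas, chosen = [], []
    for _ in range(T):
        # weighted error of every fixed classifier under the current weights
        errs = np.array([w[preds[k] != y].sum() for k in range(n_clf)])
        k = int(np.argmin(errs))            # pick the lowest weighted error
        eps = errs[k]
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))  # confidence formula
        # up-weight the examples the selected classifier got wrong
        w *= np.exp(-alpha * y * preds[k])
        w /= w.sum()
        alphas.append(alpha)
        chosen.append(k)
    return alphas, chosen

def predict_ensemble(preds, alphas, chosen):
    # sign of sum_t alpha_t * h_{chosen[t]}(x)
    return np.sign(sum(a * preds[k] for a, k in zip(alphas, chosen)))
```

Note that the pool is fixed here: each round only re-selects one of the existing prediction vectors, and no network is ever retrained.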
I want to use AdaBoost with neural networks, but I think I've misinterpreted the AdaBoost algorithm.
In this paper, the capability of Adaptive Boosting (AdaBoost) is integrated with a Convolutional Neural Network (CNN) to design a new machine learning method, AdaBoost-CNN, which can deal with large imbalanced datasets with high accuracy. AdaBoost is an ensemble method where a sequence of classifiers is trained.
Compared to random forests and XGBoost, AdaBoost performs worse when irrelevant features are included in the model, as shown by my time-series analysis of bike-sharing demand. Moreover, AdaBoost is not optimized for speed and is therefore significantly slower than XGBoost.
AdaBoost can be used to boost the performance of any machine learning algorithm, but it is best used with weak learners: models that achieve accuracy just above random chance on a classification problem. The most suitable, and therefore most common, learner used with AdaBoost is the decision tree with one level.
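A minimal sketch with scikit-learn (the synthetic dataset and hyperparameters are arbitrary choices of mine): the default weak learner of `AdaBoostClassifier` is exactly such a one-level tree, a decision stump.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Arbitrary synthetic binary classification problem
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Default base estimator is DecisionTreeClassifier(max_depth=1), i.e. a stump
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

print(clf.score(X_te, y_te))           # held-out accuracy
print(clf.estimators_[0].get_depth())  # each weak learner is a depth-1 tree
```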
AdaBoost was the first really successful boosting algorithm developed for the purpose of binary classification. AdaBoost is short for Adaptive Boosting and is a very popular boosting technique that combines multiple “weak classifiers” into a single “strong classifier”.
AdaBoost-based artificial neural network learning (abstract): A boosting-based method of learning a feed-forward artificial neural network (ANN) with a single layer of hidden neurons and a single output neuron is presented. Initially, an algorithm called Boostron is described that learns a single-layer perceptron using AdaBoost and decision stumps.
The AdaBoost algorithm [12], [18] is one of the most commonly used ensemble learning algorithms that can be used to construct a highly accurate classifier ensemble from a moderately accurate learning algorithm [19].
The Boostron approach is then extended to learn the weights of a neural network with a single hidden layer of linear neurons.
Here are some (fun) facts about AdaBoost! → The weak learners in AdaBoost are decision trees with a single split, called decision stumps. → AdaBoost works by putting more weight on difficult-to-classify instances and less on those already handled well. → AdaBoost algorithms can be used for both classification and regression problems.
Boosting summary:
1- Train your first weak classifier using the training data.
2- The first trained classifier makes mistakes on some samples and correctly classifies others. Increase the weights of the wrongly classified samples and decrease the weights of the correctly classified ones. Retrain your classifier with these weights to get your second classifier.
In your case, you would first have to resample with replacement from your data using these updated weights, create a new training set, and then train your classifier on this new data.
3- Repeat the second step T times, and at the end of each round calculate the alpha weight for the classifier according to the formula.
4- The final classifier is the weighted sum of the decisions of the T classifiers.
It is hopefully clear from this explanation that you have done it a bit wrong. Instead of retraining your networks on the new datasets, you trained them all on the original dataset. In effect you are using a random-forest-style ensemble (except that you are using NNs instead of decision trees).
PS: There is no guarantee that boosting increases accuracy. In fact, all the boosting methods that I'm aware of have so far been unsuccessful at improving accuracy with NNs as weak learners (the reason lies in the way boosting works and needs a lengthier discussion).