What is the majority vote algorithm used in Weka? I tried to figure out its code but could not understand it.
In Weka you can select multiple classifiers to be combined by weka.classifiers.meta.Vote. If you select Majority Voting as the combinationRule (which only works with nominal classes), then each of these classifiers predicts a nominal class label for a test sample. The label that was predicted most often is then selected as the output of the Vote classifier.
For example, suppose you select the following classifiers: trees.J48, bayes.NaiveBayes and functions.LibSVM, to predict the weather, which can be labelled bad, normal or good. Given a new test sample, these are their predictions:
J48 - bad
NaiveBayes - good
LibSVM - good
This results in the following votes for each possible label:
bad - 1
normal - 0
good - 2
So Weka's Vote classifier will select good as the label for the test sample, because it has the most votes among all three classifiers.
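The tally in this example can be reproduced with a few lines of plain Java. This is only a minimal sketch of the counting idea, not Weka code: the class `MajorityVoteExample` and its helpers `countVotes` and `winner` are hypothetical names made up for illustration, and ties are resolved here by simply keeping the first label, which is a simplification.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MajorityVoteExample {

    // Hypothetical helper (not part of Weka): counts how often each label was predicted.
    static Map<String, Integer> countVotes(List<String> labels, List<String> predictions) {
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (String label : labels) {
            votes.put(label, 0);              // every possible label starts with 0 votes
        }
        for (String prediction : predictions) {
            votes.merge(prediction, 1, Integer::sum);
        }
        return votes;
    }

    // Hypothetical helper: returns the label with the most votes.
    // Simplification: on a tie, the label that comes first in the map is kept.
    static String winner(Map<String, Integer> votes) {
        String best = null;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (best == null || entry.getValue() > votes.get(best)) {
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("bad", "normal", "good");
        // Predictions of J48, NaiveBayes and LibSVM from the example above
        List<String> predictions = Arrays.asList("bad", "good", "good");

        Map<String, Integer> votes = countVotes(labels, predictions);
        System.out.println(votes);          // {bad=1, normal=0, good=2}
        System.out.println(winner(votes));  // good
    }
}
```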
--Edit--
The function distributionForInstanceMajorityVoting in the source code of Weka's Vote class shows how the majority voting is implemented. I added the function below. Here is a description of what it does:
The code works pretty much as I explained above. An array votes is created with one counter for each nominal class value of the test instance. Each classifier classifies the instance, and the label with the highest probability gets a vote. If multiple labels have the same highest probability, then all of these labels receive a vote. Once all classifiers have cast their votes, the label with the most votes is selected as the label for the test instance. If multiple labels have the same number of votes, then one of these labels is selected at random.
protected double[] distributionForInstanceMajorityVoting(Instance instance) throws Exception {
    double[] probs = new double[instance.classAttribute().numValues()];
    double[] votes = new double[probs.length];
    for (int i = 0; i < m_Classifiers.length; i++) {
        probs = getClassifier(i).distributionForInstance(instance);
        int maxIndex = 0;
        for (int j = 0; j < probs.length; j++) {
            if (probs[j] > probs[maxIndex])
                maxIndex = j;
        }
        // Consider the cases when multiple classes happen to have the same probability
        for (int j = 0; j < probs.length; j++) {
            if (probs[j] == probs[maxIndex])
                votes[j]++;
        }
    }
    for (int i = 0; i < m_preBuiltClassifiers.size(); i++) {
        probs = m_preBuiltClassifiers.get(i).distributionForInstance(instance);
        int maxIndex = 0;
        for (int j = 0; j < probs.length; j++) {
            if (probs[j] > probs[maxIndex])
                maxIndex = j;
        }
        // Consider the cases when multiple classes happen to have the same probability
        for (int j = 0; j < probs.length; j++) {
            if (probs[j] == probs[maxIndex])
                votes[j]++;
        }
    }
    int tmpMajorityIndex = 0;
    for (int k = 1; k < votes.length; k++) {
        if (votes[k] > votes[tmpMajorityIndex])
            tmpMajorityIndex = k;
    }
    // Consider the cases when multiple classes receive the same amount of votes
    Vector<Integer> majorityIndexes = new Vector<Integer>();
    for (int k = 0; k < votes.length; k++) {
        if (votes[k] == votes[tmpMajorityIndex])
            majorityIndexes.add(k);
    }
    // Resolve the ties according to a uniform random distribution
    int majorityIndex = majorityIndexes.get(m_Random.nextInt(majorityIndexes.size()));
    // set probs to 0
    probs = new double[probs.length];
    probs[majorityIndex] = 1; // the class that have been voted the most receives 1
    return probs;
}
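To see the two tie rules in isolation, here is a small standalone sketch that mimics the logic of the function above without needing Weka on the classpath. It assumes the class distributions are already available as plain `double[]` rows, one per classifier. The class `TieAwareVoteSketch` and its methods `tally` and `pickWinner` are hypothetical names, not part of Weka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class TieAwareVoteSketch {

    // Each row is one classifier's class distribution for the same instance.
    // Every class that shares the row's maximum probability receives one vote.
    static double[] tally(double[][] distributions, int numClasses) {
        double[] votes = new double[numClasses];
        for (double[] probs : distributions) {
            double max = probs[0];
            for (double p : probs) {
                max = Math.max(max, p);
            }
            for (int j = 0; j < numClasses; j++) {
                if (probs[j] == max) {
                    votes[j]++;
                }
            }
        }
        return votes;
    }

    // Picks the class index with the most votes; ties are broken uniformly at random.
    static int pickWinner(double[] votes, Random random) {
        double max = votes[0];
        for (double v : votes) {
            max = Math.max(max, v);
        }
        List<Integer> tied = new ArrayList<>();
        for (int k = 0; k < votes.length; k++) {
            if (votes[k] == max) {
                tied.add(k);
            }
        }
        return tied.get(random.nextInt(tied.size()));
    }

    public static void main(String[] args) {
        // Class order: 0 = bad, 1 = normal, 2 = good (illustrative values, not real model output)
        double[][] distributions = {
            {0.5, 0.0, 0.5},   // tie between bad and good: both get a vote
            {0.1, 0.2, 0.7},   // good
            {0.2, 0.2, 0.6}    // good
        };
        double[] votes = tally(distributions, 3);
        int winner = pickWinner(votes, new Random(1));
        System.out.println(java.util.Arrays.toString(votes)); // [1.0, 0.0, 3.0]
        System.out.println(winner);                            // 2
    }
}
```

Because the first row ties between two classes, both receive a vote, so the vote counts can add up to more than the number of classifiers. The random tie-break in `pickWinner` only matters when two or more classes end up with the same top count.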