
Majority vote algorithm in Weka.classifiers.meta.vote

Tags:

weka

What is the majority vote algorithm used in Weka? I tried to read its code but could not understand it.

belle asked Jul 24 '12 07:07



1 Answer

In Weka you can select multiple classifiers to combine in weka.classifiers.meta.Vote. If you select Majority Voting as the combinationRule (which only works with nominal classes), each of these classifiers predicts a nominal class label for a test sample. The label that was predicted most often is then selected as the output of the Vote classifier.

For example, suppose you select the following classifiers: trees.J48, bayes.NaiveBayes and functions.LibSVM, to predict the weather, which can be labelled bad, normal or good. Given a new test sample, these are their predictions:

J48        - bad
NaiveBayes - good
LibSVM     - good

This results in the following votes for each possible label:

bad    - 1
normal - 0
good   - 2

So Weka's Vote classifier will select good as the label for the test sample, because it has the most votes among the three classifiers.
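
The tally above can be reproduced with a few lines of plain Java. This is only a sketch of the counting idea, not Weka code: the class name MajorityVoteTally and the methods tally and winner are made up for illustration, and ties are resolved by simply keeping the first label, which is a simplification.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Weka: counts label predictions.
class MajorityVoteTally {

    // Returns one counter per label; labels nobody predicted stay at 0.
    static Map<String, Integer> tally(List<String> labels, List<String> predictions) {
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (String label : labels) {
            votes.put(label, 0);
        }
        for (String prediction : predictions) {
            votes.merge(prediction, 1, Integer::sum);
        }
        return votes;
    }

    // Returns the label with the most votes.
    // Simplification: on a tie, the label listed first wins.
    static String winner(Map<String, Integer> votes) {
        String best = null;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (best == null || entry.getValue() > votes.get(best)) {
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("bad", "normal", "good");
        // Predictions of J48, NaiveBayes and LibSVM from the example above.
        List<String> predictions = Arrays.asList("bad", "good", "good");

        Map<String, Integer> votes = tally(labels, predictions);
        System.out.println(votes);         // {bad=1, normal=0, good=2}
        System.out.println(winner(votes)); // good
    }
}
```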

--Edit--

The function distributionForInstanceMajorityVoting in the source code of Weka's Vote class shows you how the majority voting is implemented. I added the function below. Here is a description of what it does:

The code works pretty much as I explained above. The votes array holds one counter for each nominal class value of the test instance. Each classifier classifies the instance, and the label with the highest probability gets a vote. If multiple labels share the highest probability, all of them receive a vote. Once all classifiers have cast their votes, the label with the most votes is selected as the label for the test instance. If multiple labels have the same number of votes, one of them is selected at random.

protected double[] distributionForInstanceMajorityVoting(Instance instance) throws Exception {

  double[] probs = new double[instance.classAttribute().numValues()];
  double[] votes = new double[probs.length];

  for (int i = 0; i < m_Classifiers.length; i++) {
    probs = getClassifier(i).distributionForInstance(instance);
    int maxIndex = 0;
    for(int j = 0; j<probs.length; j++) {
      if(probs[j] > probs[maxIndex])
        maxIndex = j;
    }

    // Consider the cases when multiple classes happen to have the same probability
    for (int j=0; j<probs.length; j++) {
      if (probs[j] == probs[maxIndex])
        votes[j]++;
    }
  }

  for (int i = 0; i < m_preBuiltClassifiers.size(); i++) {
    probs = m_preBuiltClassifiers.get(i).distributionForInstance(instance);
    int maxIndex = 0;

    for(int j = 0; j<probs.length; j++) {
      if(probs[j] > probs[maxIndex])
        maxIndex = j;
    }

    // Consider the cases when multiple classes happen to have the same probability
    for (int j=0; j<probs.length; j++) {
      if (probs[j] == probs[maxIndex])
        votes[j]++;
    }
  }

  int tmpMajorityIndex = 0;
  for (int k = 1; k < votes.length; k++) {
    if (votes[k] > votes[tmpMajorityIndex])
      tmpMajorityIndex = k;
  }

  // Consider the cases when multiple classes receive the same amount of votes
  Vector<Integer> majorityIndexes = new Vector<Integer>();
  for (int k = 0; k < votes.length; k++) {
    if (votes[k] == votes[tmpMajorityIndex])
      majorityIndexes.add(k);
  }

  // Resolve the ties according to a uniform random distribution
  int majorityIndex = majorityIndexes.get(m_Random.nextInt(majorityIndexes.size()));

  //set probs to 0
  probs = new double[probs.length];

  probs[majorityIndex] = 1; // the class that has been voted for the most receives 1

  return probs;
}
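
To see the two tie cases in isolation, here is a small standalone sketch that mimics the logic of the function above without depending on Weka. The class MajorityVoteSketch and its methods countVotes and pickWinner are hypothetical names, and the probability distributions are invented inputs that stand in for what each classifier's distributionForInstance would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the voting logic; not Weka's own class.
class MajorityVoteSketch {

    // distributions: one row per classifier, one column per class label.
    static double[] countVotes(double[][] distributions, int numClasses) {
        double[] votes = new double[numClasses];
        for (double[] probs : distributions) {
            int maxIndex = 0;
            for (int j = 1; j < probs.length; j++) {
                if (probs[j] > probs[maxIndex]) {
                    maxIndex = j;
                }
            }
            // Every label tied for the top probability receives a vote.
            for (int j = 0; j < probs.length; j++) {
                if (probs[j] == probs[maxIndex]) {
                    votes[j]++;
                }
            }
        }
        return votes;
    }

    // Picks the label with the most votes; ties are broken uniformly at random.
    static int pickWinner(double[] votes, Random random) {
        int best = 0;
        for (int k = 1; k < votes.length; k++) {
            if (votes[k] > votes[best]) {
                best = k;
            }
        }
        List<Integer> tied = new ArrayList<>();
        for (int k = 0; k < votes.length; k++) {
            if (votes[k] == votes[best]) {
                tied.add(k);
            }
        }
        return tied.get(random.nextInt(tied.size()));
    }

    public static void main(String[] args) {
        // Label indexes: 0 = bad, 1 = normal, 2 = good (made-up inputs).
        double[][] distributions = {
            {0.7, 0.2, 0.1}, // votes for bad
            {0.1, 0.1, 0.8}, // votes for good
            {0.4, 0.2, 0.4}, // tie: votes for both bad and good
        };
        double[] votes = countVotes(distributions, 3);
        System.out.println(Arrays.toString(votes)); // [2.0, 0.0, 2.0]

        // bad and good are tied, so the result is 0 or 2 depending on the seed.
        System.out.println(pickWinner(votes, new Random(1)));
    }
}
```

Note that a single classifier can cast more than one vote when its top probabilities are exactly equal, so the vote total can exceed the number of classifiers.
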
Sicco answered Nov 15 '22 08:11
