
Accord.net NaiveBayesLearning "Index was outside the bounds of the array"

I am using Accord.NET 3.7.0 on .NET Core 1.1.

The algorithm I use is naive Bayes, and the source code of the learning step is as follows:

    public LearningResultViewModel NaiveBayes(int[][] inputs, int[] outputs)
    {
        // Create a new Naive Bayes learning
        var learner = new NaiveBayesLearning();

        // Learn a Naive Bayes model from the examples
        NaiveBayes nb = learner.Learn(inputs, outputs);

        #region test phase
        // Compute the machine outputs
        int[] predicted = nb.Decide(inputs);

        // Use confusion matrix to compute some statistics.
        ConfusionMatrix confusionMatrix = new ConfusionMatrix(predicted, outputs, 1, 0);
        #endregion

        LearningResultViewModel result = new LearningResultViewModel()
        {
            Distributions = nb.Distributions,
            NumberOfClasses = nb.NumberOfClasses,
            NumberOfInputs = nb.NumberOfInputs,
            NumberOfOutputs = nb.NumberOfOutputs,
            NumberOfSymbols = nb.NumberOfSymbols,
            Priors = nb.Priors,
            confusionMatrix = confusionMatrix
        };

        return result;
    }

I tested this code on a small amount of data, but as the data grew, the error

Index was outside the bounds of the array

occurred.

Since I can't step into the Learn method, I don't know what to do. Here is a screenshot of the run-time error:

Run-time error screen shot

There is no extra information and no inner exception, so I have no idea what is wrong.

TG.

// UPDATE_1 ***

The inputs array is a 180 by 4 matrix (array), as the image below shows:

Inputs

It has 4 columns in every row; I checked this by hand (I can share a video of it too if needed).

The outputs array has 180 elements, as shown here:

Outputs

It contains only 0 and 1 (I can share a video of it too if needed).
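The shape checks described above can also be done in code rather than by hand. This is a minimal sketch in plain C#, assuming the same `int[][] inputs` and `int[] outputs` that are passed to the learner; `TrainingDataChecks` and `DescribeShapeProblems` are hypothetical names and not part of Accord.NET.

```csharp
using System;
using System.Collections.Generic;

public static class TrainingDataChecks
{
    // Hypothetical helper: returns one message per shape problem found.
    // An empty list means row count, column count and labels look consistent.
    public static List<string> DescribeShapeProblems(int[][] inputs, int[] outputs, int expectedColumns)
    {
        var problems = new List<string>();

        // Every input row needs exactly one output label.
        if (inputs.Length != outputs.Length)
            problems.Add($"inputs has {inputs.Length} rows but outputs has {outputs.Length} entries");

        // Every row must exist and have the expected number of columns.
        for (int i = 0; i < inputs.Length; i++)
        {
            if (inputs[i] == null)
                problems.Add($"row {i} is null");
            else if (inputs[i].Length != expectedColumns)
                problems.Add($"row {i} has {inputs[i].Length} columns, expected {expectedColumns}");
        }

        // The labels in this question are binary, so anything else is suspicious.
        for (int i = 0; i < outputs.Length; i++)
        {
            if (outputs[i] != 0 && outputs[i] != 1)
                problems.Add($"outputs[{i}] is {outputs[i]}, expected 0 or 1");
        }

        return problems;
    }
}
```

Calling `TrainingDataChecks.DescribeShapeProblems(inputs, outputs, 4)` before `learner.Learn(inputs, outputs)` would confirm the shape, but it says nothing about the values inside the rows.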

The NaiveBayesLearning documentation is here:

NaiveBayesLearning

More examples are at the bottom of this page:

More examples

And the Learn method documentation is here:

learn method doc

ConductedClever asked Aug 22 '17 17:08


1 Answer

Following the ideas in the comments, I suspected the values in the inputs matrix, so I inspected them:

problem

As shown in the image above, some rows contain negative values. The inputs matrix is generated by Codification, which is used in the examples here:

NaiveBayes

with the docs below:

Codification docs

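The codified value -1 corresponded to null values in the source data. A negative symbol is presumably used as an index during learning, which would explain the exception. See the screenshot below: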

one of problematic records
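Instead of spotting the negative values in the debugger, the codified matrix can be scanned for them. This is a minimal sketch in plain C#, assuming `inputs` is the `int[][]` produced by the codification step; `SymbolChecks` and `FindNegativeSymbols` are hypothetical names and not part of Accord.NET.

```csharp
using System;
using System.Collections.Generic;

public static class SymbolChecks
{
    // Hypothetical helper: returns the (row, column) position of every
    // negative symbol in the codified matrix.
    public static List<Tuple<int, int>> FindNegativeSymbols(int[][] inputs)
    {
        var positions = new List<Tuple<int, int>>();

        for (int row = 0; row < inputs.Length; row++)
        {
            for (int col = 0; col < inputs[row].Length; col++)
            {
                // Symbols are expected to be zero or greater; in this case
                // -1 turned out to mark a null value in the source data.
                if (inputs[row][col] < 0)
                    positions.Add(Tuple.Create(row, col));
            }
        }

        return positions;
    }
}
```

Each reported row index can then be looked up in the uncodified string data to see which original value produced the negative symbol.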

So my solution was to replace null values with the string "null". There may be better solutions.

The calling method, with the data fix applied, is now as follows:

    public LearningResultViewModel Learn(EMVDBContext dBContext, string userId, LearningAlgorithm learningAlgorithm)
    {
        var learningDataRaw = dBContext.Mutants
            .Include(mu => mu.MutationOperator)
            .Where(mu => mu.Equivalecy == 0 || mu.Equivalecy == 10);

        string[] featureTitles = new string[] {
            "ChangeType",
            "OperatorName",
            "OperatorBefore",
            "OperatorAfter",
        };

        string[][] learningInputNotCodified = learningDataRaw.Select(ldr => new string[] {
            ldr.ChangeType.ToString(),
            ldr.MutationOperator.Name??"null",
            ldr.MutationOperator.Before??"null",
            ldr.MutationOperator.After??"null",
        }).ToArray();

        int[] learningOutputNotCodified = learningDataRaw.Select(ldr => ldr.Equivalecy == 0 ? 0 : 1).ToArray();

        #region Codification phase
        // Create a new codification codebook to
        // convert strings into discrete symbols
        Codification codebook = new Codification(featureTitles, learningInputNotCodified);

        // Extract input and output pairs to train
        int[][] learningInput = codebook.Transform(learningInputNotCodified);

        switch (learningAlgorithm)
        {
            case LearningAlgorithm.NaiveBayesian:
                return learningService.NaiveBayes(learningInput, learningOutputNotCodified);
            case LearningAlgorithm.SVM:
                break;
            default:
                break;
        }
        #endregion

        return null;
    }
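The per-field `?? "null"` replacement in the method above could also be done in one place, so a newly added feature column cannot miss it. This is a minimal sketch in plain C#, assuming the rows are the uncodified `string[][]` built before codification; `MissingValues`, `Placeholder` and `ReplaceNulls` are hypothetical names and not part of Accord.NET.

```csharp
using System;

public static class MissingValues
{
    // Hypothetical placeholder; it matches the "null" string used in the answer.
    public const string Placeholder = "null";

    // Returns a copy of the rows in which every null cell is replaced by
    // the placeholder, so the codebook sees an ordinary string instead.
    public static string[][] ReplaceNulls(string[][] rows, string placeholder = Placeholder)
    {
        var result = new string[rows.Length][];

        for (int i = 0; i < rows.Length; i++)
        {
            result[i] = new string[rows[i].Length];
            for (int j = 0; j < rows[i].Length; j++)
                result[i][j] = rows[i][j] ?? placeholder;
        }

        return result;
    }
}
```

One caveat of this design: if a real value in the data happens to equal the placeholder string, it is merged with the missing values, so the placeholder should be something that cannot occur as a genuine value.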

I hope this helps others who encounter the same problem.

ConductedClever answered Sep 28 '22 08:09