Simple text classification using Naive Bayes (Weka) in Java

Tags: java, naivebayes, text-classification, weka, arff

I am trying to do text classification with Naive Bayes using the Weka library in my Java code, but I think the classification results are incorrect, and I don't know what the problem is. I use ARFF files as input.

This is my training data:

@relation hamspam

@attribute text string
@attribute class {spam,ham}

@data
'good',ham
'good',ham
'very good',ham
'bad',spam
'very bad',spam
'very bad, very bad',spam
'good good bad',ham

This is my testing data:

@relation test

@attribute text string
@attribute class {spam,ham}

@data
'good bad very bad',?
'good bad very bad',?
'good',?
'good very good',?
'bad',?
'very good',?
'very very good',?

And this is my code:

public static void NaiveBayes(String training_file, String testing_file) throws FileNotFoundException, IOException, Exception {
    //filter
    StringToWordVector filter = new StringToWordVector();

    Classifier naive = new NaiveBayes();

    //training data
    Instances train = new Instances(new BufferedReader(new FileReader(training_file)));
    int lastIndex = train.numAttributes() - 1;
    train.setClassIndex(lastIndex);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    //testing data
    Instances test = new Instances(new BufferedReader(new FileReader(testing_file)));
    test.setClassIndex(lastIndex);
    filter.setInputFormat(test);
    Instances test2 = Filter.useFilter(test, filter);

    naive.buildClassifier(train);

    for (int i = 0; i < test2.numInstances(); i++) {
        System.out.println(test.instance(i));
        double index = naive.classifyInstance(test2.instance(i));
        String className = train.attribute(0).value((int) index);
        System.out.println(className);
    }
}

The results show that data that should have been classified as spam is classified as ham, and data that should have been classified as ham is classified as spam. What is the problem? Please help.


asked Jan 30 '17 11:01

Muhammad Haryadi Futra

1 Answer

Your code seems fine, though I have two comments to make.

First, you set the filter's format with filter.setInputFormat(train); so that the same filter can make the training and test data compatible. You should not set the format again with filter.setInputFormat(test); as this might create compatibility issues.
Second, instead of reading the first attribute with train.attribute(0).value((int)index); (which, it seems to me, does not correspond to the class attribute), try train.classAttribute().value((int)index);
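
To illustrate the first comment, here is a small Weka-free sketch of why a word dictionary should be built once, from the training data, and then reused. The class VocabularyDemo and its two helper methods are hypothetical names made up for this illustration, and the tokenization is deliberately simplistic; Weka's StringToWordVector builds its dictionary differently in detail, so this only shows the general idea, not what Weka does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of Weka.
class VocabularyDemo {

    // Collect distinct words in first-seen order; a word's position is its column index.
    static List<String> buildVocabulary(List<String> documents) {
        Set<String> words = new LinkedHashSet<>();
        for (String doc : documents) {
            for (String w : doc.toLowerCase().split("[\\s,]+")) {
                if (!w.isEmpty()) {
                    words.add(w);
                }
            }
        }
        return new ArrayList<>(words);
    }

    // Mark presence (1) or absence (0) of each vocabulary word in the document.
    static int[] vectorize(String document, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        for (String w : document.toLowerCase().split("[\\s,]+")) {
            int column = vocabulary.indexOf(w);
            if (column >= 0) {
                vector[column] = 1;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> train = Arrays.asList("good", "very good", "bad", "very bad");
        List<String> test = Arrays.asList("good bad very bad", "good", "bad");

        List<String> trainVocab = buildVocabulary(train);
        List<String> testVocab = buildVocabulary(test);

        // The same text gets different columns depending on which vocabulary is used,
        // so a model trained on trainVocab columns would misread testVocab vectors.
        System.out.println(trainVocab + " " + Arrays.toString(vectorize("bad", trainVocab)));
        System.out.println(testVocab + " " + Arrays.toString(vectorize("bad", testVocab)));
    }
}
```

In this toy setup the word "bad" lands in a different column under each vocabulary, which is the kind of mismatch that reusing the training-time dictionary avoids.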

P.S. See "Load naïve Bayes model in Java code using weka jar" for a complete workflow and an explanation of a classification example (the material was once in SO Documentation). That example uses the LibLinear classifier, but the logic is the same.
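
For the code in this question, a version with both comments applied might look like the sketch below. It assumes the Weka jar is on the classpath. The class name HamSpamClassifier and the methods load and classify are made up for this example; the Weka calls themselves are the same ones used in the question. I have not run it against the exact data above, so treat it as a starting point rather than a verified fix.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

// Hypothetical wrapper class for this sketch.
class HamSpamClassifier {

    // Read an ARFF file and mark the last attribute as the class.
    static Instances load(String arffPath) throws Exception {
        try (BufferedReader reader = new BufferedReader(new FileReader(arffPath))) {
            Instances data = new Instances(reader);
            data.setClassIndex(data.numAttributes() - 1);
            return data;
        }
    }

    // Train on rawTrain and return one predicted class label per instance of rawTest.
    static List<String> classify(Instances rawTrain, Instances rawTest) throws Exception {
        StringToWordVector filter = new StringToWordVector();
        // Set the input format once, on the training data only.
        filter.setInputFormat(rawTrain);
        Instances train = Filter.useFilter(rawTrain, filter);
        // Reuse the same filter for the test data; no second setInputFormat call.
        Instances test = Filter.useFilter(rawTest, filter);

        Classifier naive = new NaiveBayes();
        naive.buildClassifier(train);

        List<String> labels = new ArrayList<>();
        for (int i = 0; i < test.numInstances(); i++) {
            double index = naive.classifyInstance(test.instance(i));
            // Look the label up on the class attribute rather than on a fixed position.
            labels.add(train.classAttribute().value((int) index));
        }
        return labels;
    }

    public static void main(String[] args) throws Exception {
        Instances rawTrain = load(args[0]);
        Instances rawTest = load(args[1]);
        List<String> labels = classify(rawTrain, rawTest);
        for (int i = 0; i < labels.size(); i++) {
            System.out.println(rawTest.instance(i) + " -> " + labels.get(i));
        }
    }
}
```

Run it with the training ARFF path as the first argument and the testing ARFF path as the second.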


answered Oct 13 '22 17:10

xro7
