Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

libsvm java implementation

Tags:

java

svm

libsvm

I am trying to use the java bindings for libsvm:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

I have implemented a 'trivial' example which is easily linearly separable in y. The data is defined as:

double[][] train = new double[1000][]; 
double[][] test = new double[10][];

for (int i = 0; i < train.length; i++){
    if (i+1 > (train.length/2)){        // 50% positive
        double[] vals = {1,0,i+i};
        train[i] = vals;
    } else {
        double[] vals = {0,0,i-i-i-2}; // 50% negative
        train[i] = vals;
    }           
}

Where the first 'feature' is the class and the training set is similarly defined.

To train the model:

private svm_model svmTrain() {
    svm_problem prob = new svm_problem();
    int dataCount = train.length;
    prob.y = new double[dataCount];
    prob.l = dataCount;
    prob.x = new svm_node[dataCount][];     

    for (int i = 0; i < dataCount; i++){            
        double[] features = train[i];
        prob.x[i] = new svm_node[features.length-1];
        for (int j = 1; j < features.length; j++){
            svm_node node = new svm_node();
            node.index = j;
            node.value = features[j];
            prob.x[i][j-1] = node;
        }           
        prob.y[i] = features[0];
    }               

    svm_parameter param = new svm_parameter();
    param.probability = 1;
    param.gamma = 0.5;
    param.nu = 0.5;
    param.C = 1;
    param.svm_type = svm_parameter.C_SVC;
    param.kernel_type = svm_parameter.LINEAR;       
    param.cache_size = 20000;
    param.eps = 0.001;      

    svm_model model = svm.svm_train(prob, param);

    return model;
}

Then to evaluate the model I use:

public int evaluate(double[] features) {
    svm_node node = new svm_node();
    for (int i = 1; i < features.length; i++){
        node.index = i;
        node.value = features[i];
    }
    svm_node[] nodes = new svm_node[1];
    nodes[0] = node;

    int totalClasses = 2;       
    int[] labels = new int[totalClasses];
    svm.svm_get_labels(_model,labels);

    double[] prob_estimates = new double[totalClasses];
    double v = svm.svm_predict_probability(_model, nodes, prob_estimates);

    for (int i = 0; i < totalClasses; i++){
        System.out.print("(" + labels[i] + ":" + prob_estimates[i] + ")");
    }
    System.out.println("(Actual:" + features[0] + " Prediction:" + v + ")");            

    return (int)v;
}

Where the passed array is a point from the testing set.

The results are always returning class 0. With the exact results being:

(0:0.9882998314585194)(1:0.011700168541480586)(Actual:0.0 Prediction:0.0)
(0:0.9883952943701599)(1:0.011604705629839989)(Actual:0.0 Prediction:0.0)
(0:0.9884899803606306)(1:0.011510019639369528)(Actual:0.0 Prediction:0.0)
(0:0.9885838957058696)(1:0.011416104294130458)(Actual:0.0 Prediction:0.0)
(0:0.9886770466322342)(1:0.011322953367765776)(Actual:0.0 Prediction:0.0)
(0:0.9870913229268679)(1:0.012908677073132284)(Actual:1.0 Prediction:0.0)
(0:0.9868781382588805)(1:0.013121861741119505)(Actual:1.0 Prediction:0.0)
(0:0.986661444476744)(1:0.013338555523255982)(Actual:1.0 Prediction:0.0)
(0:0.9864411843906802)(1:0.013558815609319848)(Actual:1.0 Prediction:0.0)
(0:0.9862172999068877)(1:0.013782700093112332)(Actual:1.0 Prediction:0.0)

Can someone explain why this classifier is not working? Is there a step I have messed up, or a step I am missing?

Thanks

like image 981
user1220022 Avatar asked May 29 '12 02:05

user1220022


People also ask

What is LIBSVM used for?

LIBSVM is an integrated software for support vector classification, (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM). It supports multi-class classification. Python, R, MATLAB, Perl, Ruby, Weka, Common LISP, CLISP, Haskell, OCaml, LabVIEW, and PHP interfaces.

What is LIBSVM algorithm?

LIBSVM implements the Sequential minimal optimization (SMO) algorithm for kernelized support vector machines (SVMs), supporting classification and regression. LIBLINEAR implements linear SVMs and logistic regression models trained using a coordinate descent algorithm.

What is LIBSVM format?

MLlib supports reading training examples stored in LIBSVM format, which is the default format used by LIBSVM and LIBLINEAR . It is a text format in which each line represents a labeled sparse feature vector using the following format: label index1:value1 index2:value2 ...


1 Answers

it seems to me that your evaluate method is wrong. Should be something like this:

public double evaluate(double[] features, svm_model model) 
{
    svm_node[] nodes = new svm_node[features.length-1];
    for (int i = 1; i < features.length; i++)
    {
        svm_node node = new svm_node();
        node.index = i;
        node.value = features[i];

        nodes[i-1] = node;
    }

    int totalClasses = 2;       
    int[] labels = new int[totalClasses];
    svm.svm_get_labels(model,labels);

    double[] prob_estimates = new double[totalClasses];
    double v = svm.svm_predict_probability(model, nodes, prob_estimates);

    for (int i = 0; i < totalClasses; i++){
        System.out.print("(" + labels[i] + ":" + prob_estimates[i] + ")");
    }
    System.out.println("(Actual:" + features[0] + " Prediction:" + v + ")");            

    return v;
}
like image 89
cosmin Avatar answered Sep 28 '22 10:09

cosmin