
How to calculate ROC curves?

Tags: classification, matlab, roc, false-positive, threshold

I wrote a classifier (a Gaussian Mixture Model) to classify five human actions. For every observation, the classifier computes the posterior probability of belonging to each cluster.

I want to evaluate the performance of my system as a function of a threshold that takes values from 0 to 100. For every threshold value and every observation, if the probability of belonging to one of the clusters is greater than the threshold, I accept the result of the classifier; otherwise I discard it.

For every threshold value I compute the number of true positives, true negatives, false positives, and false negatives.

Then I compute two functions, sensitivity and specificity, as

sensitivity = TP/(TP+FN);

specificity=TN/(TN+FP);

In MATLAB:

plot(1-specificity,sensitivity);

to get the ROC curve. But the result isn't what I expect.

This is the plot of discards, errors, correct results, sensitivity, and specificity as the threshold varies, for one action.

[Figure: discards, errors, correct results, sensitivity, and specificity versus threshold]

This is the ROC curve for one action.

[Figure: ROC curve]

This is a stem plot of the ROC curve for the same action.

[Figure: stem plot of the ROC curve]

I am doing something wrong, but I don't know where. Perhaps I am computing FP, FN, TP, and TN incorrectly, especially when the classifier's result is below the threshold, so that I have a discard. What should I increment when there is a discard?


asked Oct 19 '12 18:10

Mario Lepore

1 Answer

Background

I am answering this because I need to work through the content, and a question like this is a great excuse. Thank you for the good opportunity.

I use the built-in Fisher iris data set: http://archive.ics.uci.edu/ml/datasets/Iris

I also use code snippets from the MathWorks tutorial on classification, and from the plotroc documentation:

http://www.mathworks.com/products/demos/statistics/classdemo.html
http://www.mathworks.com/help/nnet/ref/plotroc.html?searchHighlight=plotroc

Problem Description

There is a clear boundary within the domain for classifying "setosa", but "versicolor" and "virginica" overlap. This is a two-dimensional plot, and some of the other information has been discarded to produce it. The ambiguity in the classification boundaries is useful in this case.

%load data
load fisheriris

%show raw data
figure(1); clf
gscatter(meas(:,1), meas(:,2), species,'rgb','osd');
xlabel('Sepal length');
ylabel('Sepal width');
axis equal
axis tight
title('Raw Data')

[Figure: scatter plot of the raw data, sepal length versus sepal width, grouped by species]

Analysis

Let's say that we want to determine the bounds for a linear classifier that separates "virginica" from "non-virginica". We could look at "self vs. not-self" for other classes, but they would have their own

So now we fit several discriminant classifiers (linear, quadratic, and Mahalanobis variants) and plot the ROC for them:

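%load data: fisheriris provides meas and species,
%iris_dataset provides irisInputs (4x150) and irisTargets (3x150, one row per class)
load fisheriris
load iris_dataset

%first two features (not used below; classify takes meas directly)
irisInputs=meas(:,1:2)';
%keep only the third target row: 1 for "virginica", 0 otherwise
irisTargets=irisTargets(3,:);

%train and evaluate on the same data with five discriminant types
ldaClass1 = classify(meas(:,1:2),meas(:,1:2),irisTargets,'linear')';
ldaClass2 = classify(meas(:,1:2),meas(:,1:2),irisTargets,'diaglinear')';
ldaClass3 = classify(meas(:,1:2),meas(:,1:2),irisTargets,'quadratic')';
ldaClass4 = classify(meas(:,1:2),meas(:,1:2),irisTargets,'diagquadratic')';
ldaClass5 = classify(meas(:,1:2),meas(:,1:2),irisTargets,'mahalanobis')';

%one row per classifier: targets repeated five times, outputs stacked
myinput=repmat(irisTargets,5,1);
myoutput=[ldaClass1;ldaClass2;ldaClass3;ldaClass4;ldaClass5];
whos %list the workspace variables and their sizes
plotroc(myinput,myoutput)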

The result is shown below, although I had to delete repeated copies of the diagonal line:

[Figure: ROC plot produced by plotroc for the five classifiers]

Note that in the code I stack "myinput" (the targets) and "myoutput" (the classifier outputs) and feed them into the "plotroc" function. If you supply your own target values and the outputs of your classifier in the same way, you can get similar results. plotroc compares the actual output of your classifier against the ideal output given by your target values.
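One caveat worth noting: the first output of classify is a hard 0/1 label, so each row passed to plotroc yields a single operating point rather than a curve. Here is a minimal sketch of one way to get a full curve, assuming the Statistics Toolbox is available. It takes the posterior probabilities from the third output of classify and passes them to perfcurve. The variable names (isVirginica, scoreVirginica) are mine, and the assumed column order of the posterior matrix is flagged in the code.

```matlab
% Sketch only: continuous scores instead of hard labels.
load fisheriris                                        % provides meas and species
X = meas(:, 1:2);                                      % sepal length and width
isVirginica = double(strcmp(species, 'virginica'));    % 150x1, 1 = virginica

% Third output of classify: posterior probability of each group.
[~, ~, posterior] = classify(X, X, isVirginica, 'linear');

% ASSUMPTION: columns follow the sorted group values [0 1],
% so column 2 is the posterior for "virginica".
scoreVirginica = posterior(:, 2);

% perfcurve sweeps the threshold over the scores for the positive class (1).
[fpr, tpr, thr, auc] = perfcurve(isVirginica, scoreVirginica, 1);

figure; plot(fpr, tpr);
xlabel('1 - specificity'); ylabel('sensitivity');
title(sprintf('ROC from posterior scores (AUC = %.3f)', auc));
```

Like the code above, this trains and evaluates on the same data, so the curve is optimistic.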

So this gives you a "built-in" ROC, which is useful for quick work but does not teach you every step in detail.
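To see the individual steps, and to address the question about discards, here is a minimal sketch of computing the ROC by hand for one action (one-vs-rest). The labels and scores are made-up illustrative values, not data from the question. The convention assumed here is the usual one for ROC analysis: an observation whose score falls below the threshold is not dropped but counted as a negative prediction, so it increments FN if it truly is the action and TN otherwise.

```matlab
% Hypothetical data for illustration only:
% labels(i) = 1 if observation i truly is the action of interest, else 0
% scores(i) = classifier posterior probability for that action
labels = [1    1    0    1    0    1    0    0    1    0   ];
scores = [0.95 0.85 0.80 0.70 0.55 0.45 0.40 0.30 0.20 0.10];

thresholds = linspace(1, 0, 101);        % 100% down to 0%
nT = numel(thresholds);
sensitivity = zeros(1, nT);
specificity = zeros(1, nT);

for k = 1:nT
    % Below the threshold counts as a negative prediction (no discard class).
    predicted = scores >= thresholds(k);

    TP = sum( predicted & (labels == 1));
    FN = sum(~predicted & (labels == 1));
    FP = sum( predicted & (labels == 0));
    TN = sum(~predicted & (labels == 0));

    sensitivity(k) = TP / (TP + FN);
    specificity(k) = TN / (TN + FP);
end

fpr = 1 - specificity;
auc = trapz(fpr, sensitivity);           % area under the curve

figure; plot(fpr, sensitivity, '-o');
xlabel('1 - specificity'); ylabel('sensitivity');
title('ROC computed by hand');
```

With this convention TP+FN and TN+FP stay constant across thresholds, so the curve runs from (0,0) at the highest threshold to (1,1) at the lowest. If discarded observations are removed from the counts instead, the denominators change with the threshold and the plot is no longer a standard ROC curve.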

Questions you can ask at this point include:

Which classifier is best? How do I determine what "best" means in this case?
What is the convex hull of the classifiers? Is there some mixture of classifiers that is more informative than any pure method? Bagging, perhaps?
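As a rough illustration of the convex hull question, the sketch below takes one operating point (false positive rate, true positive rate) per classifier, adds the trivial points (0,0) and (1,1), and keeps the hull vertices on or above the chance diagonal. The five operating points are hypothetical values chosen for the example, not results from the code above, and the variable names are mine.

```matlab
% Hypothetical operating points, one per classifier (illustration only).
fpr = [0.10 0.20 0.30 0.40 0.25];
tpr = [0.50 0.60 0.85 0.80 0.90];

% Add the trivial classifiers "always negative" (0,0) and "always positive" (1,1).
x = [0; fpr(:); 1];
y = [0; tpr(:); 1];

k = unique(convhull(x, y));      % hull vertex indices, repeated endpoint removed
upper = k(y(k) >= x(k));         % keep vertices on or above the chance diagonal
[~, order] = sort(x(upper));
upper = upper(order);            % left-to-right along the upper hull

% Map back to classifier numbers (drop the two trivial points).
hullClassifiers = setdiff(upper, [1; numel(x)]) - 1;

figure; plot(x, y, 'o', x(upper), y(upper), '-');
xlabel('1 - specificity'); ylabel('sensitivity');
title('ROC convex hull of the operating points');
```

Classifiers that fall below the hull are dominated: some mixture of two hull classifiers has at least the same sensitivity at the same false positive rate.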


answered Oct 02 '22 06:10

EngrStudent
