
Why does binary accuracy give high accuracy while categorical accuracy gives low accuracy in a multi-class classification problem?

I'm working on a multiclass classification problem using Keras, and I'm using binary accuracy and categorical accuracy as metrics. When I evaluate my model I get a really high value for the binary accuracy and quite a low one for the categorical accuracy. I tried to recreate the binary accuracy metric in my own code but I am not having much luck. My understanding is that this is the process I need to recreate:

def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)

Here is my code:

import numpy as np
from keras import backend as K

preds = model.predict(X_test, batch_size=128)
print(preds)

# round the raw probabilities, the same way binary_accuracy does
roundpreds = np.round(preds)

pos = 0.0
neg = 0.0

for i, val in enumerate(roundpreds):

    if val.tolist() == y_test[i]:
        pos += 1.0

    else:
        neg += 1.0

print(pos / (pos + neg))

But this gives a much lower value than the one given by binary accuracy. Is binary accuracy even an appropriate metric to be using in a multi-class problem? If so, does anyone know where I am going wrong?

Ninja asked Sep 21 '17

People also ask

What is the difference between binary accuracy and accuracy?

The Accuracy class calculates how often predictions equal labels. This metric creates two local variables, total and count, that are used to compute the frequency with which y_pred matches y_true. This frequency is ultimately returned as binary accuracy: an idempotent operation that simply divides total by count.
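A minimal sketch of that stateful behaviour, assuming TensorFlow's tf.keras.metrics API (the values are made up for illustration):

import tensorflow as tf

# Stateful metric: total and count accumulate over calls to update_state(),
# and result() returns total / count.
m = tf.keras.metrics.Accuracy()
m.update_state(y_true=[1, 2, 3, 4], y_pred=[0, 2, 3, 4])  # 3 of 4 predictions match
print(m.result().numpy())  # 0.75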

What is a good accuracy for multiclass classification?

Generally, values over 0.7 are considered good scores. Simple formulas apply to binary classifiers; for multiclass problems, scikit-learn uses considerably more involved formulas.
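The multiclass formulas themselves are not reproduced here, but as a rough sketch, scikit-learn exposes several multiclass scores directly (sklearn.metrics functions; the labels are made up):

from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

print(accuracy_score(y_true, y_pred))            # plain accuracy: fraction of matching samples
print(balanced_accuracy_score(y_true, y_pred))   # averages per-class recall
print(f1_score(y_true, y_pred, average='macro')) # averages per-class F1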

What is the difference between accuracy and categorical accuracy?

Accuracy is a simple comparison of how many predicted values match the target values. Categorical accuracy, on the other hand, calculates the percentage of predicted values (yPred) that match the actual values (yTrue) for one-hot labels.
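A small NumPy sketch of that difference (this mirrors what Keras' categorical_accuracy computes: compare the argmax of the prediction with the argmax of the one-hot target):

import numpy as np

y_true = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 0]])                 # one-hot labels
y_pred = np.array([[0.1, 0.2, 0.3, 0.4],          # argmax = 3, true class = 0 -> wrong
                   [0.1, 0.1, 0.7, 0.1]])         # argmax = 2, true class = 2 -> correct
categorical_acc = np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))
print(categorical_acc)  # 0.5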

Why is accuracy not a good measure for multiclass classification?

The problem with accuracy and a multi-class target variable (more than 2 classes): with 3 or more classes you may get a classification accuracy of 80%, but you don't know whether that is because all classes are being predicted equally well or because one or two classes are being neglected by the model.
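For example, a model that never predicts one of the classes can still post a respectable overall accuracy; per-class metrics expose this. A sketch using scikit-learn with made-up labels:

from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]   # class 2 is never predicted: still 70% overall accuracy

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, zero_division=0))  # recall for class 2 is 0.0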


1 Answer

So you need to understand what happens when you apply the binary_accuracy metric to a multiclass prediction.

  1. Let's assume that your output from softmax is (0.1, 0.2, 0.3, 0.4) and the one-hot encoded ground truth is (1, 0, 0, 0).
  2. binary_accuracy rounds each output at 0.5. Since every value here is below 0.5, the output of your network is turned into the vector (0, 0, 0, 0).
  3. (0, 0, 0, 0) matches the ground truth (1, 0, 0, 0) on 3 out of 4 indices, so the resulting binary accuracy is 75% for a completely wrong answer (see the sketch below).
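A quick NumPy check of those numbers (this mimics, for a single sample, what Keras' binary_accuracy and categorical_accuracy compute):

import numpy as np

y_true = np.array([1, 0, 0, 0])           # one-hot ground truth
y_pred = np.array([0.1, 0.2, 0.3, 0.4])   # softmax output

binary_acc = np.mean(np.round(y_pred) == y_true)                 # rounds at 0.5 -> 0.75
categorical_acc = float(np.argmax(y_pred) == np.argmax(y_true))  # argmax comparison -> 0.0
print(binary_acc, categorical_acc)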

To solve this you could use a single-class accuracy, e.g. like this one:

from keras import backend as K

def single_class_accuracy(interesting_class_id):
    def fn(y_true, y_pred):
        # y_true is one-hot encoded, so reduce both tensors to class ids first
        class_id_true = K.argmax(y_true, axis=-1)
        class_id_preds = K.argmax(y_pred, axis=-1)
        # 1 where the prediction / ground truth is the class of interest, 0 otherwise
        positive_mask = K.cast(K.equal(class_id_preds, interesting_class_id), 'int32')
        true_mask = K.cast(K.equal(class_id_true, interesting_class_id), 'int32')
        # fraction of samples where "is it this class?" agrees between prediction and truth
        acc_mask = K.cast(K.equal(positive_mask, true_mask), 'float32')
        class_acc = K.mean(acc_mask)
        return class_acc

    return fn
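It can then be passed to model.compile like any other metric. A usage sketch, assuming model is already defined and class ids 0 and 1 are the ones you care about:

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy',
                       single_class_accuracy(0),
                       single_class_accuracy(1)])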
Marcin Możejko answered Sep 25 '22