Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I calculate the matthews correlation coefficient in tensorflow

So I made a model with tensorflow keras and it seems to work ok. However, my supervisor said it would be useful to calculate the Matthews correlation coefficient, as well as the accuracy and loss it already calculates.

my model is very similar to the code in the tutorial here (https://www.tensorflow.org/tutorials/keras/basic_classification) except with a much smaller dataset.

is there a prebuilt function or would I have to get the prediction for each test and calculate it by hand?

like image 296
Toby Peterken Avatar asked Jul 03 '19 07:07

Toby Peterken


People also ask

How do you find the Matthew correlation coefficient in Python?

To calculate the MCC of the model, we can use the following formula: MCC = (TP*TN – FP*FN) / √(TP+FP)(TP+FN)(TN+FP)(TN+FN) MCC = (15*375-5*5) / √(15+5)(15+5)(375+5)(375+5) MCC = 0.7368.

What is a good Matthews Correlation Coefficient?

A Matthews correlation coefficient close to +1, in fact, means having high values for all the other confusion matrix metrics. The same cannot be said for balanced accuracy, markedness, bookmaker informedness, accuracy and F1 score.


2 Answers

There is nothing out of the box but we can calculate it from the formula in a custom metric.

The basic classification link you supplied is for a multi-class categorisation problem whereas the Matthews Correlation Coefficient is specifically for binary classification problems.

Assuming your model is structured in the "normal" way for such problems (i.e. y_pred is a number between 0 and 1 for each record representing predicted probability of a "True" and labels are each exactly a 0 or 1 representing ground truth "False" and "True" respectively) then we can add in an MCC metric as follows:

# if y_pred > threshold we predict true. 
# Sometimes we set this to something different to 0.5 if we have unbalanced categories

threshold = 0.5  

def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, threshold), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)

which we can include in our model.compile call:

model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])

Example

Here is a complete worked example where we categorise mnist digits depending on whether they are greater than 4:

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = 0 + (y_train > 4), 0 + (y_test > 4)

def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, 0.5), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='relu'),
  tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

output:

Epoch 1/5
60000/60000 [==============================] - 7s 113us/sample - loss: 0.1391 - acc: 0.9483 - mcc_metric: 0.8972
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0722 - acc: 0.9747 - mcc_metric: 0.9495
Epoch 3/5
60000/60000 [==============================] - 6s 97us/sample - loss: 0.0576 - acc: 0.9797 - mcc_metric: 0.9594
Epoch 4/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0479 - acc: 0.9837 - mcc_metric: 0.9674
Epoch 5/5
60000/60000 [==============================] - 6s 95us/sample - loss: 0.0423 - acc: 0.9852 - mcc_metric: 0.9704
10000/10000 [==============================] - 1s 58us/sample - loss: 0.0582 - acc: 0.9818 - mcc_metric: 0.9639
[0.05817381642502733, 0.9818, 0.9638971]
like image 195
Stewart_R Avatar answered Nov 03 '22 00:11

Stewart_R


Since the asker accepted a Python version from sklearn, here is Stewart_Rs answer in pure Python:

from math import sqrt
def mcc(tp, fp, tn, fn):

    # https://stackoverflow.com/a/56875660/992687
    x = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return ((tp * tn) - (fp * fn)) / sqrt(x)

It has the advanatage of being general, not just for evaluating binary classifications.

like image 29
The Unfun Cat Avatar answered Nov 03 '22 01:11

The Unfun Cat