I have a multi-output (200) binary classification model that I wrote in Keras.
In this model I want to add additional metrics such as ROC and AUC, but to my knowledge Keras doesn't have built-in ROC and AUC metric functions.
I tried to import the ROC and AUC functions from scikit-learn:
from sklearn.metrics import roc_curve, auc
from keras.models import Sequential
from keras.layers import Dense
.
.
.
model.add(Dense(200, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(200, init='normal', activation='softmax'))  # output layer
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', 'roc_curve', 'auc'])
but it's giving this error:
Exception: Invalid metric: roc_curve
How should I add ROC and AUC to Keras?
ROC AUC is the area under the ROC curve and is often used to evaluate how well an algorithm ranks objects of two classes relative to each other. By construction, this value lies in the interval [0, 1].
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds. The curve plots two parameters against each other: the true positive rate (TPR) and the false positive rate (FPR).
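As a concrete illustration, scikit-learn's roc_curve traces TPR and FPR over every threshold; the toy labels and scores below are my own made-up values, not anything from the question:

import numpy as np
from sklearn.metrics import roc_curve, auc

# made-up ground truth and predicted scores
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# fpr and tpr at every threshold found in y_score
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)  # area under that curve: 0.75 here
print(fpr, tpr, roc_auc)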
The AUC (area under the curve) of the ROC (receiver operating characteristic; the default) or PR (precision-recall) curve is a quality measure for binary classifiers. Unlike accuracy, and like cross-entropy loss, ROC-AUC and PR-AUC evaluate all the operating points of a model.
How do ROC AUC plots work for multiclass models? For multiclass problems, ROC curves can be plotted with a one-versus-rest methodology: pit each class against all the others and you get the same number of curves as classes. The AUC score can also be calculated for each class individually, as sketched below.
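For instance, scikit-learn can do this one-versus-rest computation directly. The labels and scores here are illustrative toy values, and the multi_class='ovr' argument requires scikit-learn >= 0.22:

import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

# toy 3-class labels and per-class probability scores (rows sum to 1)
y_true = np.array([0, 1, 2, 2, 1, 0])
y_score = np.array([[0.7, 0.2, 0.1],
                    [0.2, 0.6, 0.2],
                    [0.1, 0.3, 0.6],
                    [0.2, 0.2, 0.6],
                    [0.3, 0.5, 0.2],
                    [0.8, 0.1, 0.1]])

# one-vs-rest: one binary AUC per class
y_bin = label_binarize(y_true, classes=[0, 1, 2])
per_class = [roc_auc_score(y_bin[:, i], y_score[:, i]) for i in range(3)]

# macro-averaged one-vs-rest AUC in a single call
macro = roc_auc_score(y_true, y_score, multi_class='ovr', average='macro')
print(per_class, macro)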
Because you can't calculate ROC AUC on mini-batches, you can only calculate it at the end of an epoch. There is a solution from jamartinh; I paste the code below for convenience:
from sklearn.metrics import roc_auc_score
from keras.callbacks import Callback

class RocCallback(Callback):
    def __init__(self, training_data, validation_data):
        self.x = training_data[0]
        self.y = training_data[1]
        self.x_val = validation_data[0]
        self.y_val = validation_data[1]

    def on_train_begin(self, logs={}):
        return

    def on_train_end(self, logs={}):
        return

    def on_epoch_begin(self, epoch, logs={}):
        return

    def on_epoch_end(self, epoch, logs={}):
        y_pred_train = self.model.predict_proba(self.x)
        roc_train = roc_auc_score(self.y, y_pred_train)
        y_pred_val = self.model.predict_proba(self.x_val)
        roc_val = roc_auc_score(self.y_val, y_pred_val)
        print('\rroc-auc_train: %s - roc-auc_val: %s' % (str(round(roc_train, 4)), str(round(roc_val, 4))), end=100 * ' ' + '\n')
        return

    def on_batch_begin(self, batch, logs={}):
        return

    def on_batch_end(self, batch, logs={}):
        return

roc = RocCallback(training_data=(X_train, y_train),
                  validation_data=(X_test, y_test))

model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          callbacks=[roc])
A more hackable way, using tf.contrib.metrics.streaming_auc:
import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score
from sklearn.datasets import make_classification
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.callbacks import Callback, EarlyStopping

# define roc_callback, inspired by https://github.com/keras-team/keras/issues/6050#issuecomment-329996505
def auc_roc(y_true, y_pred):
    # any tensorflow metric
    value, update_op = tf.contrib.metrics.streaming_auc(y_pred, y_true)

    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'auc_roc' in i.name.split('/')[1]]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        value = tf.identity(value)
        return value

# generate a small dataset
N_all = 10000
N_tr = int(0.7 * N_all)
N_te = N_all - N_tr
X, y = make_classification(n_samples=N_all, n_features=20, n_classes=2)
y = np_utils.to_categorical(y, num_classes=2)

X_train, X_valid = X[:N_tr, :], X[N_tr:, :]
y_train, y_valid = y[:N_tr, :], y[N_tr:, :]

# model & train
model = Sequential()
model.add(Dense(2, activation="softmax", input_shape=(X.shape[1],)))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy', auc_roc])

my_callbacks = [EarlyStopping(monitor='auc_roc', patience=300, verbose=1, mode='max')]

model.fit(X, y,
          validation_split=0.3,
          shuffle=True,
          batch_size=32, epochs=5, verbose=1,
          callbacks=my_callbacks)

# # or use an independent validation set
# model.fit(X_train, y_train,
#           validation_data=(X_valid, y_valid),
#           batch_size=32, epochs=5, verbose=1,
#           callbacks=my_callbacks)
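Note that tf.contrib was removed in TensorFlow 2.x, so the snippet above only runs on TF 1.x. On TF2, a streaming AUC already ships as tf.keras.metrics.AUC, and the whole workaround collapses to one argument in compile(); a minimal sketch, assuming tf.keras rather than standalone Keras:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(2, activation='softmax', input_shape=(20,)),
])

# tf.keras.metrics.AUC accumulates confusion-matrix counts across
# batches, so the reported value is a streaming estimate of ROC AUC
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy', tf.keras.metrics.AUC(name='auc')])

EarlyStopping(monitor='val_auc', mode='max') then works without any of the variable-collection bookkeeping above.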
Like you, I prefer using scikit-learn's built-in methods to evaluate AUROC. I find that the best and easiest way to do this in Keras is to create a custom metric. If TensorFlow is your backend, this can be implemented in very few lines of code:
import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auroc(y_true, y_pred):
    return tf.py_func(roc_auc_score, (y_true, y_pred), tf.double)

# Build Model...

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', auroc])
Creating a custom callback, as mentioned in other answers, will not work in your case since your model has multiple outputs, but this will. Additionally, this method allows the metric to be evaluated on both training and validation data, whereas a Keras callback does not have access to the training data and can thus only be used to evaluate performance on the validation data.
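One caveat worth hedging: this metric runs once per batch and Keras averages the per-batch values, and roc_auc_score raises a ValueError whenever a batch happens to contain only one class. A defensive variant (auroc_safe and its 0.5 fallback are my own additions, not part of the original answer):

import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auroc_safe(y_true, y_pred):
    def _score(y_t, y_p):
        try:
            return np.float64(roc_auc_score(y_t, y_p))
        except ValueError:
            # a batch with a single class has no defined ROC AUC;
            # fall back to chance level so training does not crash
            return np.float64(0.5)
    return tf.py_func(_score, (y_true, y_pred), tf.double)

# model.compile(loss='categorical_crossentropy', optimizer='adam',
#               metrics=['accuracy', auroc_safe])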