I would like to calculate a neural network's certainty/confidence (see What my deep model doesn't know): when the NN tells me an image represents "8", I would like to know how certain it is. Is my model 99% certain it is "8", or is it only 51% certain, so that it could just as well be "6"? Some digits are quite ambiguous, and I would like to know for which images the model is just "flipping a coin".
I have found some theoretical writings about this, but I have trouble putting them into code. If I understand correctly, I should evaluate a test image multiple times while "killing off" different neurons (using dropout) and then...?
Working on the MNIST dataset, I am running the following model:
    from keras.models import Sequential
    from keras.layers import Dense, Activation, Conv2D, Flatten, Dropout

    model = Sequential()
    model.add(Conv2D(128, kernel_size=(7, 7), activation='relu', input_shape=(28, 28, 1,)))
    model.add(Dropout(0.20))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(Dropout(0.20))
    model.add(Flatten())
    model.add(Dense(units=64, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(units=10, activation='softmax'))
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    model.fit(train_data, train_labels, batch_size=100, epochs=30, validation_data=(test_data, test_labels,))
How should I predict with this model so that I get its certainty about predictions too? I would appreciate some practical examples (preferably in Keras, but any will do).
To clarify, I am looking for an example of how to get certainty using the method outlined by Yarin Gal (or an explanation of why some other method yields better results).
Prediction uncertainty refers to the variability in prediction due to plausible alternative input values. The uncertainty about appropriate input values described by probability distributions propagates through the model to form a probability distribution for model prediction.
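The propagation described above can be illustrated with plain Monte Carlo sampling. This is only a minimal sketch: `toy_model` is a hypothetical stand-in for a trained model, and the Gaussian input distribution and its parameters are assumptions chosen for illustration.

```python
import random
import statistics


def toy_model(x):
    """Hypothetical stand-in for a trained model: a fixed nonlinear function."""
    return 3.0 * x + 0.5 * x ** 2


def propagate_input_uncertainty(model, mean, std, n_samples=10000, seed=0):
    """Sample plausible inputs and summarize the spread of the predictions."""
    rng = random.Random(seed)
    outputs = [model(rng.gauss(mean, std)) for _ in range(n_samples)]
    # The sample mean and standard deviation describe the prediction distribution.
    return statistics.mean(outputs), statistics.stdev(outputs)


pred_mean, pred_std = propagate_input_uncertainty(toy_model, mean=2.0, std=0.1)
print(f"prediction: {pred_mean:.3f} +/- {pred_std:.3f}")
```

A wider input distribution (larger `std`) yields a wider distribution of predictions, which is the sense in which input uncertainty propagates through the model.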
predict passes the input through the model and returns the output tensor for each datapoint. Since the last layer in your model is a Dense layer with 10 units and a softmax activation, the output for any datapoint is a vector of 10 values that sum to 1, one per digit class.
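For the model in the question, whose final layer is `Dense(units=10, activation='softmax')`, each row returned by `predict` holds ten class probabilities. A small sketch of reading the two most likely classes and the gap between them from one such row; the probability values are made up for illustration and `top_two` is a hypothetical helper name.

```python
def top_two(probs):
    """Return (best_class, best_prob, runner_up_class, runner_up_prob, margin)."""
    ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    best, second = ranked[0], ranked[1]
    return best, probs[best], second, probs[second], probs[best] - probs[second]


# Made-up softmax output for one ambiguous digit (index = digit class).
row = [0.01, 0.00, 0.01, 0.02, 0.00, 0.01, 0.42, 0.00, 0.51, 0.02]
best, p_best, second, p_second, margin = top_two(row)
print(f"predicted {best} (p={p_best:.2f}), "
      f"runner-up {second} (p={p_second:.2f}), margin {margin:.2f}")
```

A single softmax vector is a point estimate and can be overconfident, which is why the dropout-based approach repeats the forward pass and looks at how much the outputs vary.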
If you want to implement the dropout approach to measuring uncertainty, you should do the following:
Implement a function that applies dropout at test time as well:
    import keras.backend as K

    f = K.function([model.layers[0].input, K.learning_phase()], [model.layers[-1].output])
Use this function as an uncertainty predictor, e.g. in the following manner:
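    import numpy as np

    def predict_with_uncertainty(f, x, n_iter=10):
        # f expects [inputs, learning_phase] and returns a list of outputs;
        # learning_phase=1 keeps dropout active at test time.
        results = np.stack([f([x, 1])[0] for _ in range(n_iter)])
        prediction = results.mean(axis=0)   # mean over the stochastic passes
        uncertainty = results.var(axis=0)   # variance over the stochastic passes
        return prediction, uncertainty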
Of course, you may use any other function to compute the uncertainty.
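For a classifier such as the MNIST model in the question, one possible alternative to the per-class variance is to summarize the stochastic passes for a single image by their averaged probabilities, the predictive entropy, and how often the individual passes agree. The sketch below is dependency-free and only an illustration: `summarize_mc_samples` is a hypothetical helper, and the sample values are made up (two classes only, for brevity).

```python
import math


def summarize_mc_samples(samples):
    """Summarize stochastic forward passes for ONE image.

    samples: sequence of n_iter probability vectors (one per dropout pass).
    Returns (predicted_class, mean_probability, entropy, agreement).
    """
    n_iter = len(samples)
    n_classes = len(samples[0])
    # Average the class probabilities over the passes.
    mean_probs = [sum(s[c] for s in samples) / n_iter for c in range(n_classes)]
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    # Predictive entropy in nats: 0 = fully certain, log(n_classes) = uniform.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0)
    # Fraction of passes whose own argmax matches the averaged prediction.
    votes = sum(
        1 for s in samples
        if max(range(n_classes), key=lambda c: s[c]) == predicted
    )
    return predicted, mean_probs[predicted], entropy, votes / n_iter


# Made-up samples: a confident case and a "coin flip" case.
confident = [[0.01, 0.99], [0.02, 0.98], [0.01, 0.99]]
ambiguous = [[0.55, 0.45], [0.45, 0.55], [0.52, 0.48], [0.49, 0.51]]
print(summarize_mc_samples(confident))
print(summarize_mc_samples(ambiguous))
```

With the backend function `f` defined above (the older Keras backend API), the samples for the first image in a batch `x` could be collected as `[f([x, 1])[0][0] for _ in range(n_iter)]`. High entropy or low agreement flags the images on which the model is effectively guessing.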