Keras custom metric sum is wrong

I tried implementing precision and recall as custom metrics following https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model/45166#45166?newreg=6190503b2be14e8aa2c0069d0a52749e, but for some reason the numbers were off (I do know about the per-batch averaging problem; that's not what I'm talking about).

So I tried implementing another metric:

from keras import backend as K

def p1(y_true, y_pred):
    # Sum of positive labels in the current batch
    return K.sum(y_true)

Just to see what would happen. What I'd expect is a straight-line chart at the number of 1's I have in my dataset (I'm working on a binary classification problem with binary_crossentropy loss).

Because Keras computes custom metrics as averages of the per-batch results, with a batch of size 32 I'd expect this p1 metric to return 16, but instead I got 15. With a batch of size 16, I get something close to 7.9. That was when I tried the fit method.
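
To illustrate the averaging I mean, here is a minimal sketch with made-up per-batch sums:

import numpy as np

# Hypothetical per-batch values of K.sum(y_true) with batch_size=32;
# Keras displays the running mean of these per-batch results,
# not a global sum over the epoch.
batch_sums = [17, 15, 16, 14, 18]
print(np.mean(batch_sums))  # 16.0 -- the value the progress bar would show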

I also calculated the validation precision manually after training the model, and it gives me a different number than the last val_precision in history. That was using fit_generator, in which case batch_size is not provided, so I'm assuming it computes the metric over the entire validation dataset at once.

Another important detail is that when I use the same dataset for training and validation, even when I get the same numbers of true positives and predicted positives at the last epoch, the training and validation precisions are different (1 and 0.6).

true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))

Apparently 32.0 / (32.0 + K.epsilon()) = 0.6000000238418579

Any idea what's wrong?

Something that might help:

[screenshot of the training metric charts omitted]

def p1(y_true, y_pred):
    # Reciprocal of the batch's true positive count
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    return 1.0 / (true_positives + K.epsilon())

def p2(y_true, y_pred):
    # Reciprocal of the batch's predicted positive count
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return 1.0 / (predicted_positives + K.epsilon())

def p3(y_true, y_pred):
    # True positive count for the batch
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    return true_positives

def p4(y_true, y_pred):
    # Predicted positive count for the batch
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return predicted_positives
asked Jan 13 '20 by Rodrigo Ruiz


1 Answer

Honestly, I ran into the same problem at one point, and for me the best solution was to use the built-in Recall and Precision metrics.

Starting with TensorFlow 2.0, these two metrics are built into tensorflow.keras.metrics, and they work well provided that you use binary_crossentropy with a Dense(1) final layer (after all, they are metrics for binary classification).
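
For example, here is a minimal sketch of compiling a model with the built-in metrics (the layer sizes and input shape are placeholder assumptions):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),  # placeholder hidden layer
    tf.keras.layers.Dense(1, activation='sigmoid'),                   # single-unit binary output
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)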

The important thing to note is that their implementation is completely different from what you are trying to achieve, and from what existed in Keras before.

In fact, in Keras 1.X all of those metrics were available (F1-score, recall, and precision), but they were removed starting from Keras 2.X, because batch-wise estimation of these global metrics is misleading.

According to Francois Chollet (March 19th 2017) (https://github.com/keras-team/keras/issues/5794):

Basically these are all global metrics that were approximated batch-wise, which is more misleading than helpful. This was mentioned in the docs but it's much cleaner to remove them altogether. It was a mistake to merge them in the first place.

However, in TensorFlow 2.0 (tensorflow.keras.metrics), these metrics use specialised built-in accumulators, so the computations are done properly and the results are meaningful for your whole dataset. You can find a more detailed description here:

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/Recall?version=stable
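
To see the stateful accumulation in action, here is a small sketch (the labels and predictions below are made up):

import tensorflow as tf

recall = tf.keras.metrics.Recall()

# update_state() accumulates true-positive / false-negative counts
recall.update_state([1, 1, 0, 0], [1, 0, 0, 0])  # batch 1
recall.update_state([1, 0, 1, 1], [1, 0, 1, 0])  # batch 2

# result() is computed from the accumulated counts, i.e. over all
# samples seen so far, not as an average of per-batch recalls.
print(recall.result().numpy())  # 0.6 = 3 TP / (3 TP + 2 FN)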

My strong recommendation: use the built-in metrics and skip implementing them by hand, particularly since a hand-rolled implementation would naturally end up being batch-wise.

If you have issues with loading the model, please ensure the following:

  • Ensure that you have Python 3 installed (>=3.6)
  • If the issue persists, ensure that your custom objects are passed to load_model, as in the following snippet:

    import tensorflow

    metric_config_dict = {
        'precision': precision
    }

    model = tensorflow.keras.models.load_model('path_to_my_model.hdf5',
                                               custom_objects=metric_config_dict)

Francois Chollet on the release of Keras 2.3.0:

Keras 2.3.0 is the first release of multi-backend Keras that supports TensorFlow 2.0. It maintains compatibility with TensorFlow 1.14, 1.13, as well as Theano and CNTK.

This release brings the API in sync with the tf.keras API as of TensorFlow 2.0. However note that it does not support most TensorFlow 2.0 features, in particular eager execution. If you need these features, use tf.keras.

This is also the last major release of multi-backend Keras. Going forward, we recommend that users consider switching their Keras code to tf.keras in TensorFlow 2.0. It implements the same Keras 2.3.0 API (so switching should be as easy as changing the Keras import statements), but it has many advantages for TensorFlow users, such as support for eager execution, distribution, TPU training, and generally far better integration between low-level TensorFlow and high-level concepts like Layer and Model. It is also better maintained.

Development will focus on tf.keras going forward. We will keep maintaining multi-backend Keras over the next 6 months, but we will only be merging bug fixes. API changes will not be ported.

Therefore, even the creator of Keras recommends switching to tf.keras instead of plain keras. Please switch in your code as well and check whether the problems persist. If you mix tf.keras and keras, you will get all sorts of odd errors, so change all of your imports to tf.keras. For more information about TensorFlow 2.0 and the related changes, you can consult this: https://www.pyimagesearch.com/2019/10/21/keras-vs-tf-keras-whats-the-difference-in-tensorflow-2-0/
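
The import switch itself is mostly mechanical; for example (illustrative imports only):

# Before (multi-backend Keras):
#   from keras.models import Sequential
#   from keras.layers import Dense

# After (tf.keras):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense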

answered Oct 16 '22 by Timbus Calin