TensorFlow estimator: Switching to careful_interpolation to get the correct PR-AUC of a model

In my project, I am using the premade estimator DNNClassifier. Here is my estimator:

model = tf.estimator.DNNClassifier(
    hidden_units=network,
    feature_columns=feature_cols,
    n_classes=2,
    activation_fn=tf.nn.relu,
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    ),
    config=chk_point_run_config,
    model_dir=MODEL_CHECKPOINT_DIR
)

When I evaluate the model using eval_res = model.evaluate(...), I get the following warning:

WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

How can I switch to careful_interpolation to get the correct results from the evaluate() method?

TensorFlow version: 1.8

Asked Mar 06 '23 by Ashiqur Rahman

1 Answer

Unfortunately, a pre-made estimator leaves little freedom for customizing the evaluation process. As of TensorFlow 1.8, DNNClassifier does not expose a way to adjust its evaluation metrics, and the same holds for the other canned estimators.

Albeit not ideal, one solution is to augment the estimator with the desired metrics using tf.contrib.estimator.add_metrics, which replaces an existing metric when the new one is assigned the exact same key. Its documentation states:

If there is a name conflict between this and estimators existing metrics, this will override the existing one.
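The override behaviour quoted above amounts to a dictionary merge in which the user-supplied metrics win on a key collision. A rough pure-Python sketch of the assumed semantics (illustrative only, not the actual add_metrics implementation; the values are placeholders):

```python
# Sketch of the key-override behaviour quoted above (assumed semantics,
# not the real tf.contrib.estimator.add_metrics internals).
builtin_metrics = {'auc_precision_recall': 0.0517}  # estimator's own metric
custom_metrics = {'auc_precision_recall': 0.0506}   # returned by our metric_fn

# On a name conflict, the custom metric replaces the built-in one.
merged = {**builtin_metrics, **custom_metrics}
print(merged)  # {'auc_precision_recall': 0.0506}
```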

This approach has the advantage of working with any estimator that produces probabilistic predictions, at the expense of still computing the overridden metric on each evaluation. A DNNClassifier provides logistic values (between 0 and 1) under the key 'logistic' (the list of possible keys in canned estimators is here). This is not always the case for other estimator heads, but alternatives may be available: in a multi-label classifier built with tf.contrib.estimator.multi_label_head, 'logistic' is not available, but 'probabilities' can be used instead.

Hence, the code would look like this:

def metric_auc(labels, predictions):
    """Recomputes PR-AUC with careful interpolation under the built-in key,
    so it overrides the estimator's trapezoidal version."""
    return {
        'auc_precision_recall': tf.metrics.auc(
            labels=labels, predictions=predictions['logistic'], num_thresholds=200,
            curve='PR', summation_method='careful_interpolation')
    }

estimator = tf.estimator.DNNClassifier(...)
estimator = tf.contrib.estimator.add_metrics(estimator, metric_auc)

When evaluating, the warning message will still appear, but the AUC with careful interpolation is computed right afterwards. Assigning the new metric to a different key instead would let you check the discrepancy between the two summation methods. My tests on a multi-label logistic regression task show that the measurements can indeed differ slightly: auc_precision_recall = 0.05173396, auc_precision_recall_careful = 0.05059402.
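The gap between the two numbers comes down to how the area under the PR curve is accumulated. A small NumPy sketch of the idea on a toy ranking (illustrative only: step-wise average precision stands in here for the careful method; TF's careful_interpolation actually interpolates true-positive counts between operating points, but both avoid the trapezoid's assumption that precision varies linearly with recall):

```python
import numpy as np

# Toy ranking: labels of the examples sorted by descending model score.
labels = np.array([1, 0, 0, 1, 1])
n_pos = labels.sum()

# Precision and recall after each prefix of the ranking.
tp = np.cumsum(labels)
precision = tp / np.arange(1, len(labels) + 1)
recall = tp / n_pos

# Trapezoidal PR-AUC, with the conventional (recall=0, precision=1) anchor.
p = np.r_[1.0, precision]
r = np.r_[0.0, recall]
trap = np.sum(np.diff(r) * (p[1:] + p[:-1]) / 2.0)

# Step-wise average precision: precision at each hit, averaged over positives.
# No linear interpolation between operating points.
ap = np.sum(precision * labels) / n_pos

print(trap, ap)  # ~0.6556 vs 0.7 -- same ranking, different summation
```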


There is also a reason why the default summation method is still 'trapezoidal', in spite of the documentation suggesting that careful interpolation is "strictly preferred": as commented in pull request #19079, changing the default would be significantly backwards incompatible. Subsequent comments on the same pull request suggest the workaround above.

Answered Mar 14 '23 by E_net4 stands with Ukraine