In my project, I am using the premade estimator DNNClassifier.
Here is my estimator:
model = tf.estimator.DNNClassifier(
    hidden_units=network,
    feature_columns=feature_cols,
    n_classes=2,
    activation_fn=tf.nn.relu,
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    ),
    config=chk_point_run_config,
    model_dir=MODEL_CHECKPOINT_DIR
)
When I evaluate the model using eval_res = model.evaluate(..), I get the following warning:
WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.
How can I switch to careful_interpolation to get correct results from the evaluate() method?
TensorFlow version: 1.8
Unfortunately, the use of a pre-made estimator leaves little freedom for customizing the evaluation process. Currently, a DNNClassifier does not seem to provide a means to adjust its evaluation metrics, and the same holds for the other canned estimators.
Albeit not ideal, one solution is to augment an estimator with the desired metrics using tf.contrib.estimator.add_metrics, which will replace the old metric if the exact same key is assigned to the new one. As its documentation states:
"If there is a name conflict between this and the estimator's existing metrics, this will override the existing one."
It comes with the advantage of working for any estimator that produces probabilistic predictions, at the expense of still calculating the overridden metric at each evaluation. A DNNClassifier estimator provides logistic values (between 0 and 1) under the key 'logistic' (the list of possible keys in canned estimators is here). This might not always be the case for other estimator heads, but alternatives may be available: in a multi-label classifier built with tf.contrib.estimator.multi_label_head, logistic is not available, but probabilities can be used instead.
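Since the only thing that changes between heads is the prediction key, the metric function can be made reusable with a small fallback. The helper below is hypothetical (not part of the TensorFlow API), and is just a sketch of that key lookup using plain dicts in place of real estimator predictions:

```python
def pick_scores(predictions):
    # Hypothetical helper: prefer the binary head's 'logistic' key,
    # and fall back to 'probabilities' for heads (e.g. one built with
    # multi_label_head) that do not expose it.
    for key in ('logistic', 'probabilities'):
        if key in predictions:
            return predictions[key]
    raise KeyError("no probability-like key in predictions")

# Dummy prediction dicts standing in for estimator outputs:
binary_preds = {'logistic': [0.9, 0.2]}
multi_label_preds = {'probabilities': [[0.7, 0.1], [0.3, 0.8]]}
```

A metric function written against pick_scores(predictions) instead of predictions['logistic'] would then work unchanged for both kinds of head.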
Hence, the code would look like this:
def metric_auc(labels, predictions):
    return {
        'auc_precision_recall': tf.metrics.auc(
            labels=labels, predictions=predictions['logistic'], num_thresholds=200,
            curve='PR', summation_method='careful_interpolation')
    }

estimator = tf.estimator.DNNClassifier(...)
estimator = tf.contrib.estimator.add_metrics(estimator, metric_auc)
When evaluating, the warning message will still appear, but the AUC with careful interpolation will be computed immediately afterwards. Assigning this metric to a different key would also allow you to check the discrepancy between the two summation methods. My tests on a multi-label logistic regression task show that the measurements may indeed be slightly different: auc_precision_recall = 0.05173396, auc_precision_recall_careful = 0.05059402.
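To see where the discrepancy comes from, here is a minimal pure-Python sketch (not TensorFlow's implementation) of the area under one segment of a PR curve, computed both with the trapezoidal rule and with the interpolation scheme of Davis & Goadrich that 'careful_interpolation' follows, which assumes false positives grow linearly with true positives between operating points:

```python
import math

def pr_auc_segment_trapezoid(tp_a, fp_a, tp_b, fp_b, total_pos):
    # Straight-line (trapezoidal) area between two PR points.
    prec_a = tp_a / (tp_a + fp_a)
    prec_b = tp_b / (tp_b + fp_b)
    rec_a, rec_b = tp_a / total_pos, tp_b / total_pos
    return 0.5 * (prec_a + prec_b) * (rec_b - rec_a)

def pr_auc_segment_careful(tp_a, fp_a, tp_b, fp_b, total_pos):
    # Interpolate FP linearly in TP, then integrate precision over recall:
    # precision(t) = t / (A*t + B) with A = 1 + skew, B = fp_a - skew * tp_a.
    if tp_b == tp_a:
        return 0.0
    skew = (fp_b - fp_a) / (tp_b - tp_a)
    A, B = 1.0 + skew, fp_a - skew * tp_a
    def antideriv(t):
        # antiderivative of t / (A*t + B)
        return t / A - (B / A**2) * math.log(A * t + B)
    return (antideriv(tp_b) - antideriv(tp_a)) / total_pos

# Example segment: 10 positives total; from (TP=2, FP=0) to (TP=10, FP=10).
trap = pr_auc_segment_trapezoid(2, 0, 10, 10, 10)   # 0.6
care = pr_auc_segment_careful(2, 0, 10, 10, 10)     # ~0.4693
```

On this segment the trapezoidal rule overestimates the area (0.6 vs ~0.469), because precision does not vary linearly with recall between the two points; this is exactly the bias the warning message refers to.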
There is also a reason why the default summation method is still 'trapezoidal'
, in spite of the documentation suggesting that careful interpolation is "strictly preferred". As commented in pull request #19079, the change would be significantly backwards incompatible. Subsequent comments on the same pull request suggest the workaround above.