In my project, I am using the premade estimator DNNClassifier.
Here is my estimator:
model = tf.estimator.DNNClassifier(
    hidden_units=network,
    feature_columns=feature_cols,
    n_classes=2,
    activation_fn=tf.nn.relu,
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    ),
    config=chk_point_run_config,
    model_dir=MODEL_CHECKPOINT_DIR
)
When I evaluate the model using eval_res = model.evaluate(..), I get the following warning:
WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.
How can I switch to careful_interpolation to get correct results from the evaluate() method?
TensorFlow version: 1.8
Unfortunately, the use of a pre-made estimator leaves little freedom for customizing the evaluation process. Currently, a DNNClassifier does not seem to provide a means to adjust its evaluation metrics, and the same holds for the other canned estimators.
Albeit not ideal, one solution is to augment an estimator with the desired metrics using tf.contrib.estimator.add_metrics, which will replace the old metric if the exact same key is assigned to the new one. As its documentation states:
"If there is a name conflict between this and the estimator's existing metrics, this will override the existing one."
It comes with the advantage of working for any estimator that produces probabilistic predictions, at the expense of still calculating the overridden metric at each evaluation. A DNNClassifier estimator provides logistic values (between 0 and 1) under the key 'logistic' (the list of possible keys in canned estimators is here). This might not always be the case for other estimator heads, but alternatives may be available: in a multi-label classifier built with tf.contrib.estimator.multi_label_head, logistic is not available, but probabilities can be used instead.
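Since the only thing that changes between heads is the prediction key, the metric function can be made reusable with a small fallback. The helper below is hypothetical (not part of the TensorFlow API), and is just a sketch of that key lookup using plain dicts in place of real estimator predictions:

```python
def pick_scores(predictions):
    # Hypothetical helper: prefer the binary head's 'logistic' key,
    # and fall back to 'probabilities' for heads (e.g. one built with
    # multi_label_head) that do not expose it.
    for key in ('logistic', 'probabilities'):
        if key in predictions:
            return predictions[key]
    raise KeyError("no probability-like key in predictions")

# Dummy prediction dicts standing in for estimator outputs:
binary_preds = {'logistic': [0.9, 0.2]}
multi_label_preds = {'probabilities': [[0.7, 0.1], [0.3, 0.8]]}
```

A metric function written against pick_scores(predictions) instead of predictions['logistic'] would then work unchanged for both kinds of head.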
Hence, the code would look like this:
def metric_auc(labels, predictions):
    return {
        'auc_precision_recall': tf.metrics.auc(
            labels=labels, predictions=predictions['logistic'], num_thresholds=200,
            curve='PR', summation_method='careful_interpolation')
    }

estimator = tf.estimator.DNNClassifier(...)
estimator = tf.contrib.estimator.add_metrics(estimator, metric_auc)
When evaluating, the warning message will still appear, but the AUC with careful interpolation will be computed immediately afterwards. Assigning this metric to a different key would also allow you to check the discrepancy between the two summation methods. My tests on a multi-label logistic regression task show that the measurements may indeed be slightly different: auc_precision_recall = 0.05173396, auc_precision_recall_careful = 0.05059402.
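To see where the discrepancy comes from, here is a minimal pure-Python sketch (not TensorFlow's implementation) of the area under one segment of a PR curve, computed both with the trapezoidal rule and with the interpolation scheme of Davis & Goadrich that 'careful_interpolation' follows, which assumes false positives grow linearly with true positives between operating points:

```python
import math

def pr_auc_segment_trapezoid(tp_a, fp_a, tp_b, fp_b, total_pos):
    # Straight-line (trapezoidal) area between two PR points.
    prec_a = tp_a / (tp_a + fp_a)
    prec_b = tp_b / (tp_b + fp_b)
    rec_a, rec_b = tp_a / total_pos, tp_b / total_pos
    return 0.5 * (prec_a + prec_b) * (rec_b - rec_a)

def pr_auc_segment_careful(tp_a, fp_a, tp_b, fp_b, total_pos):
    # Interpolate FP linearly in TP, then integrate precision over recall:
    # precision(t) = t / (A*t + B) with A = 1 + skew, B = fp_a - skew * tp_a.
    if tp_b == tp_a:
        return 0.0
    skew = (fp_b - fp_a) / (tp_b - tp_a)
    A, B = 1.0 + skew, fp_a - skew * tp_a
    def antideriv(t):
        # antiderivative of t / (A*t + B)
        return t / A - (B / A**2) * math.log(A * t + B)
    return (antideriv(tp_b) - antideriv(tp_a)) / total_pos

# Example segment: 10 positives total; from (TP=2, FP=0) to (TP=10, FP=10).
trap = pr_auc_segment_trapezoid(2, 0, 10, 10, 10)   # 0.6
care = pr_auc_segment_careful(2, 0, 10, 10, 10)     # ~0.4693
```

On this segment the trapezoidal rule overestimates the area (0.6 vs ~0.469), because precision does not vary linearly with recall between the two points; this is exactly the bias the warning message refers to.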
There is also a reason why the default summation method is still 'trapezoidal'
, in spite of the documentation suggesting that careful interpolation is "strictly preferred". As commented in pull request #19079, the change would be significantly backwards incompatible. Subsequent comments on the same pull request suggest the workaround above.