
Upweight a Category

I have built a TensorFlow model that uses a DNNClassifier to classify input into two categories.

My problem is that Outcome 1 occurs 90-95% of the time. As a result, TensorFlow gives me the same probabilities for all of my predictions.

I am trying to predict the other outcome (i.e. a false positive for Outcome 2 is preferable to missing a possible occurrence of Outcome 2). I know that in machine learning in general, it would be worthwhile to try upweighting Outcome 2 in this case.

However, I don't know how to do this in TensorFlow. The documentation alludes to it being possible, but I can't find any examples of what it would actually look like. Has anyone successfully done this, or does anyone know where I could find some example code or a thorough explanation (I'm using Python)?

Note: I have seen exposed weights being manipulated when someone is using the more fundamental parts of TensorFlow and not an estimator. For maintenance reasons, I need to do this using an estimator.

Abigail Fox asked Jan 04 '18 15:01



1 Answer

The tf.estimator.DNNClassifier constructor has a weight_column argument:

weight_column: A string or a _NumericColumn created by tf.feature_column.numeric_column defining feature column representing weights. It is used to down weight or boost examples during training. It will be multiplied by the loss of the example. If it is a string, it is used as a key to fetch weight tensor from the features. If it is a _NumericColumn, raw tensor is fetched by key weight_column.key, then weight_column.normalizer_fn is applied on it to get weight tensor.
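To make "multiplied by the loss of the example" concrete, here is a small NumPy sketch. It does not call TensorFlow; the probabilities, labels and the weight of 10 are made-up illustration values, and how the estimator reduces the weighted losses over a batch is not shown.

```python
import numpy as np

# Hypothetical model outputs: probability assigned to the TRUE class
# of each of four examples (illustrative numbers only).
p_true = np.array([0.9, 0.8, 0.95, 0.2])
labels = np.array([0, 0, 0, 1])  # the last example belongs to the rare class

# Per-example cross-entropy loss.
per_example_loss = -np.log(p_true)

# Assumed weighting scheme: rare class (label 1) counts 10x, the rest 1x.
weights = np.where(labels == 1, 10.0, 1.0)

# The weight simply scales each example's loss.
weighted_loss = weights * per_example_loss

print(per_example_loss)
print(weighted_loss)
```

With these numbers the single misclassified rare example contributes more to the total than the three common examples combined, which is what pushes the model to stop ignoring the rare class.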

So just add a weight column to your features and give examples of the rare class a larger weight than the rest:

weight = tf.feature_column.numeric_column('weight')
...
tf.estimator.DNNClassifier(..., weight_column=weight)
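One way to fill that column is to derive it from the labels. The helper below is only a sketch: the name make_example_weights, the choice of label 1 as the rare class, and the weight of 10.0 are assumptions for illustration, not part of the TensorFlow API. The right weight depends on your data and is worth tuning.

```python
import numpy as np

def make_example_weights(labels, rare_class=1, rare_weight=10.0):
    """Return one weight per example: rare_weight for the rare class, 1.0 otherwise.

    Hypothetical helper; rare_class and rare_weight are illustrative defaults.
    """
    labels = np.asarray(labels)
    return np.where(labels == rare_class, rare_weight, 1.0).astype(np.float32)

# Toy labels: Outcome 2 (encoded as 1 here) is the rare one.
labels = np.array([0, 0, 0, 1, 0, 1])
weights = make_example_weights(labels)
print(weights)
```

The resulting array can be passed in the features dict under the same key that the weight column uses ('weight' in the snippet above).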

[Update] Here's a complete working example (written against the TensorFlow 1.x API):

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('mnist', one_hot=False)
train_x, train_y = mnist.train.next_batch(1024)
test_x, test_y = mnist.test.images, mnist.test.labels

x_column = tf.feature_column.numeric_column('x', shape=[784])
weight_column = tf.feature_column.numeric_column('weight')
classifier = tf.estimator.DNNClassifier(feature_columns=[x_column],
                                        hidden_units=[100, 100],
                                        weight_column=weight_column,
                                        n_classes=10)

# Training. Every example gets weight 1.0 here; use larger values for the rare class to upweight it.
train_input_fn = tf.estimator.inputs.numpy_input_fn(x={'x': train_x, 'weight': np.ones(train_x.shape[0])},
                                                    y=train_y.astype(np.int32),
                                                    num_epochs=None, shuffle=True)
classifier.train(input_fn=train_input_fn, steps=1000)

# Testing
test_input_fn = tf.estimator.inputs.numpy_input_fn(x={'x': test_x, 'weight': np.ones(test_x.shape[0])},
                                                   y=test_y.astype(np.int32),
                                                   num_epochs=1, shuffle=False)
acc = classifier.evaluate(input_fn=test_input_fn)
print('Test Accuracy: %.3f' % acc['accuracy'])
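The example above passes np.ones as the weights, so every example counts equally. For the imbalanced two-class case in the question, one possible replacement is inverse class frequency weighting, sketched below with NumPy only. The helper name inverse_frequency_weights and the toy data are assumptions for illustration. The formula n_samples / (n_classes * class_count) is a common convention, not something the estimator requires.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Weight each example by n_samples / (n_classes * count of its class).

    Hypothetical helper: rarer classes receive proportionally larger weights,
    so every class contributes the same total weight.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    per_class = labels.size / (len(classes) * counts)
    lookup = dict(zip(classes.tolist(), per_class.tolist()))
    return np.array([lookup[c] for c in labels.tolist()], dtype=np.float32)

# Toy stand-ins for real training data: 90% Outcome 1 (label 0), 10% Outcome 2 (label 1).
train_y = np.array([0] * 18 + [1] * 2)
train_x = np.zeros((train_y.size, 784), dtype=np.float32)

weights = inverse_frequency_weights(train_y)

# This dict has the same layout as the x= argument of the training input function above.
features = {'x': train_x, 'weight': weights}
print(weights)
```

Leaving the test weights at 1, as in the example, keeps the reported accuracy an ordinary unweighted accuracy.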
Maxim answered Oct 27 '22 02:10