 

How to use sample weights with tensorflow datasets?

I have been training a unet model for multiclass semantic segmentation in python using Tensorflow and Tensorflow Datasets.

I've noticed that one of my classes seems to be underrepresented in training. After doing some research, I found out about sample weights and thought they might be a good solution to my problem, but I've been having trouble deciphering the documentation on how to use them or finding examples of them in use.

Could someone explain how sample weights come into play when training with datasets, or point me to an example where they are implemented? Even knowing what type of input the model.fit function expects would be helpful.

asked Mar 25 '21 by jtheck314

People also ask

What is sample weight in TensorFlow?

A "sample weights" array is an array of numbers that specify how much weight each sample in a batch should have in computing the total loss. It is commonly used in imbalanced classification problems (the idea being to give more weight to rarely-seen classes).
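As a toy illustration of that idea (plain Python, with hypothetical per-sample loss values), the total loss is just a weighted mean of the per-sample losses:

```python
# Hypothetical per-sample losses for a batch of three samples.
per_sample_loss = [0.2, 0.3, 1.5]

# Give the third sample (say, a rarely-seen class) double weight.
weights = [1.0, 1.0, 2.0]

# Each sample's loss is scaled by its weight before averaging
# over the batch.
weighted_loss = sum(w * l for w, l in zip(weights, per_sample_loss)) / len(per_sample_loss)
unweighted_loss = sum(per_sample_loss) / len(per_sample_loss)
```

Upweighting the rare sample pulls the total loss toward that sample's error, so the optimizer works harder on it.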

How do sample weights work?

Sampling weights are often the reciprocal of the likelihood of being sampled (i.e., the selection probability) of the sampling unit. For example, if you have selected 200 goldfish out of a population of 1000, the reciprocal of the selection probability is 1000/200, so the sampling weight for each goldfish would be 5.
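In code, the goldfish example works out like this:

```python
population = 1000
sampled = 200

# Selection probability: the chance that any one goldfish was sampled.
selection_probability = sampled / population  # 0.2

# The sampling weight is the reciprocal of that probability.
sampling_weight = 1 / selection_probability  # 5.0
```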


1 Answer

From the documentation of tf.keras model.fit():

sample_weight

[...] This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance, instead provide the sample_weights as the third element of x.

What does that mean? It is demonstrated for the Dataset case in one of the official Keras documentation tutorials:

import numpy as np
import tensorflow as tf

# x_train / y_train are the usual NumPy training arrays
# (MNIST digits in the tutorial).
sample_weight = np.ones(shape=(len(y_train),))
sample_weight[y_train == 5] = 2.0  # give class "5" twice the weight

# Create a Dataset that includes sample weights
# (3rd element in the return tuple).
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train, sample_weight))

# Shuffle and batch the dataset.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

model = get_compiled_model()
model.fit(train_dataset, epochs=1)

See the link for a full-fledged example.
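For a multiclass segmentation task like the asker's, a common pattern (used in the official TensorFlow image segmentation tutorial) is to map a per-class weight onto every pixel with tf.gather; the class_weights values below are just an illustration:

```python
import tensorflow as tf

# Hypothetical per-class weights -- here class 2 is assumed to be
# underrepresented, so its pixels count three times as much in the loss.
class_weights = tf.constant([1.0, 1.0, 3.0])

def add_sample_weights(image, label):
    # Look up the weight for each pixel from its class label, producing
    # a weight map with the same shape as `label`.
    sample_weights = tf.gather(class_weights, indices=tf.cast(label, tf.int32))
    return image, label, sample_weights

# Applied to a dataset yielding (image, label) pairs:
# train_dataset = train_dataset.map(add_sample_weights).batch(64)
```

model.fit then picks up the third tuple element as the per-pixel sample weights, just as in the MNIST example above.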

answered Oct 23 '22 by desertnaut