
Using different sample weights for each output in a multi-output Keras model

My input array is image_array, which holds 10000 images of size 512x512 with 4 channels, i.e. image_array.shape = (10000, 512, 512, 4). Each image has an associated metric that I want to train a CNN to predict, so metric_array.shape = (10000,). Because I do not want the network to be biased towards metric values that simply occur more frequently, I also have a weighting array with one weight per metric value, so weightArray.shape = (10000,).

I am using Keras. This is my Sequential model:

model = Sequential()
model.add(Conv2D(32, use_bias=True, kernel_size=(3,3), strides=(1, 1), activation='relu', input_shape=(512,512,4)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(64, use_bias=True, kernel_size=(3,3), strides=(1, 1), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, use_bias=True, kernel_size=(3,3), strides=(1, 1), activation='relu'))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(32))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(Dense(1, activation=relu_max))

I want to use the mean squared error loss function and the stochastic gradient descent optimizer. I compile my model:

model.compile(loss='mean_squared_error', optimizer=optimizers.SGD(lr=0.01))

I split my dataset into training and validation:

X_train, X_validate, Y_train, Y_validate, W_train, W_validate \
= train_test_split(image_array, metric_array, weightArray, test_size=0.3)

And finally train the model:

model.fit(X_train, Y_train, epochs=100, batch_size=32,
          validation_data=(X_validate, Y_validate), sample_weight=W_train)

All of the above works. Now I would like to use two metrics instead of one. I have a value of metric1 and a value of metric2 for each image, and each value of metric1 and of metric2 has an associated weight. Hence

metric_array1.shape = metric_array2.shape = weightArray1.shape = weightArray2.shape = (10000,)

My network would then have two output nodes, one for each metric.

I tried changing the last layer above to:

model.add(Dense(2, activation=relu_max))

I then combined the metric and weight data into a metric_array and a weightArray of tuples, each with shape (10000, 2). This led me to find out that a Sequential model is designed for a single output, and that I should therefore use a Functional model instead.

I have read some of the documentation and it seems quite complicated. I tried using the model above (but with 2 nodes in the last layer) and then doing

from keras.models import Model
new_model = Model(model)

But this failed when I tried to compile it, because Model has no .add method.

Is there a simple way of modifying what I already have to achieve this? I would really appreciate any guidance.

Asked by Luismi98 on Aug 04 '19 at 18:08


1 Answer

First of all, let's clear up a misunderstanding:

If your model has a single input layer and a single output layer, you can use the Sequential API to construct it, regardless of the number of neurons in those layers. If, on the other hand, your model has multiple input or output layers, you must use the Functional API to define it (again, no matter how many neurons those layers have).
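To illustrate the distinction, here is a minimal sketch (the layer sizes and random data are made up for illustration) of a Sequential model whose single output layer has two neurons. It trains on a target array of shape (n_samples, 2), but because there is only one output layer, it takes only one weight per sample, shared by both target values:

```python
import numpy as np
from keras import layers, models

# One output layer with two neurons: still a valid Sequential model.
model = models.Sequential()
model.add(layers.Dense(8, activation='relu', input_shape=(5,)))
model.add(layers.Dense(2))
model.compile(loss='mse', optimizer='sgd')

# Illustrative random data (not from the question).
n_samples = 64
X = np.random.rand(n_samples, 5)
y = np.random.rand(n_samples, 2)  # both targets packed into one array
w = np.random.rand(n_samples)     # one weight per sample, shared by both targets

history = model.fit(X, y, sample_weight=w, epochs=1, batch_size=16, verbose=0)
```

So Dense(2) in a Sequential model is fine in itself; what it cannot express is a separate weight array for each of the two targets.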

Now, you have stated that your model has two output values and that you want to use a different sample weighting for each of them. To do that, your model must have two output layers; you can then pass the sample_weight argument as a dictionary that maps each output layer's name to its weight array.
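As a rough illustration of what per-output weighting means, the NumPy sketch below computes a weighted mean squared error for each output separately and adds the results. The function names are hypothetical, and the reduction (averaging the weighted per-sample losses over the batch) is an assumption; the exact reduction Keras applies may differ between versions.

```python
import numpy as np

def weighted_mse(y_true, y_pred, weights):
    """Per-sample squared error, scaled by that sample's weight, then averaged."""
    per_sample = np.mean((y_true - y_pred) ** 2, axis=-1)
    return float(np.mean(per_sample * weights))

def total_loss(targets, preds, sample_weights):
    """Sum of the weighted losses of all outputs, keyed by output name."""
    return sum(weighted_mse(targets[k], preds[k], sample_weights[k])
               for k in targets)

targets = {'out1': np.array([[1.0], [2.0]]), 'out2': np.array([[0.0], [0.0]])}
preds   = {'out1': np.array([[0.0], [2.0]]), 'out2': np.array([[1.0], [3.0]])}
weights = {'out1': np.array([2.0, 1.0]),     'out2': np.array([0.0, 1.0])}

loss = total_loss(targets, preds, weights)
print(loss)  # 1.0 (out1) + 4.5 (out2) = 5.5
```

Note that the first sample contributes nothing to the out2 term (its weight there is 0) while counting double for out1, which a single shared weight array could not express.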

To make this clearer, consider this dummy example:

from keras import layers
from keras import models 
import numpy as np

inp = layers.Input(shape=(5,))
# assign names to output layers for more clarity
out1 = layers.Dense(1, name='out1')(inp)
out2 = layers.Dense(1, name='out2')(inp)

model = models.Model(inp, [out1, out2])
model.compile(loss='mse',
              optimizer='adam')

# create some dummy training data as well as sample weight
n_samples = 100
X = np.random.rand(n_samples, 5)
y1 = np.random.rand(n_samples,1)
y2 = np.random.rand(n_samples,1)

w1 = np.random.rand(n_samples,)
w2 = np.random.rand(n_samples,)

model.fit(X, [y1, y2], epochs=5, batch_size=16, sample_weight={'out1': w1, 'out2': w2})
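
The same pattern can be applied to the CNN from the question. The following is a sketch under a few assumptions, not a definitive implementation: the function name build_two_head_cnn and the output names metric1 and metric2 are made up for illustration, the built-in 'relu' stands in for the question's custom relu_max (which is not shown), and tiny random 32x32 images replace the real 512x512 data so the example runs quickly. Targets are passed as a dictionary keyed by the same output names as the weights; the behaviour may vary slightly between Keras versions.

```python
import numpy as np
from keras import layers, models

def build_two_head_cnn(input_shape=(512, 512, 4)):
    """Shared convolutional trunk with one named output layer per metric."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, (3, 3), activation='relu')(inp)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(128, (3, 3), activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(32, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    # Assumption: 'relu' replaces the question's custom relu_max activation.
    out1 = layers.Dense(1, activation='relu', name='metric1')(x)
    out2 = layers.Dense(1, activation='relu', name='metric2')(x)
    model = models.Model(inp, [out1, out2])
    model.compile(loss='mse', optimizer='sgd')
    return model

# Small random stand-in data; replace with the real arrays and input shape.
n = 16
rng = np.random.default_rng(0)
X = rng.random((n, 32, 32, 4)).astype('float32')
y1, y2 = rng.random((n, 1)), rng.random((n, 1))
w1, w2 = rng.random(n), rng.random(n)

model = build_two_head_cnn(input_shape=(32, 32, 4))
history = model.fit(X, {'metric1': y1, 'metric2': y2},
                    sample_weight={'metric1': w1, 'metric2': w2},
                    epochs=1, batch_size=8, verbose=0)
pred1, pred2 = model.predict(X[:2], verbose=0)
```

For the real data, train_test_split accepts any number of arrays, so image_array, metric_array1, metric_array2, weightArray1 and weightArray2 can all be split in one call before being passed to fit as above.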
Answered by today on Nov 10 '22 at 12:11
