I am trying to detect the single pixel location of a single object in an image. I have a Keras CNN regression network with my image tensor as the input, and a 3-item vector as the output.
First item: 1 if an object was found, 0 if no object was found
Second item: a number between 0 and 1 indicating how far along the x axis the object is
Third item: a number between 0 and 1 indicating how far along the y axis the object is
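For example, for an object at pixel (px, py) in one of my 256x256 images, the label vector looks roughly like this (the helper name is just for illustration):
import numpy as np

def make_label(px, py, found=True, width=256, height=256):
    # [found flag, x fraction, y fraction], all intended to lie in [0, 1]
    if not found:
        return np.array([0.0, 0.0, 0.0], dtype=np.float32)
    return np.array([1.0, float(px) / (width - 1), float(py) / (height - 1)], dtype=np.float32)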
I have trained the network on 2000 training images and 500 validation images; the val_loss is well below 1, and the val_acc peaks at around 0.94. Excellent.
But when I run predictions, I find the values for all three output items are not between 0 and 1; they actually range from roughly -2 to 3. All three items should be between 0 and 1.
I have not used any non-linear activation function on the output layer, and have used ReLUs for all non-output layers. Should I be using a softmax, even though it is non-linear? The second and third items predict positions along the x and y axes of the image, which seem to me like linear quantities.
Here is my Keras network:
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense
from keras.optimizers import Adam

inputs = Input((256, 256, 1))
base_kernels = 64
# 256x256 resolution
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
conv1 = BatchNormalization()(conv1)
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
conv1 = BatchNormalization()(conv1)
conv1 = Dropout(0.2)(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
# 128x128 resolution
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
conv2 = BatchNormalization()(conv2)
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
conv2 = BatchNormalization()(conv2)
conv2 = Dropout(0.2)(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
# 64x64 resolution
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
conv3 = BatchNormalization()(conv3)
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
conv3 = BatchNormalization()(conv3)
conv3 = Dropout(0.2)(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
dense = Dense(256, activation='relu')(flat)
output = Dense(3)(dense)
model = Model(inputs=[inputs], outputs=[output])
optimizer = Adam(lr=1e-4)
model.compile(optimizer=optimizer, loss='mean_absolute_error', metrics=['accuracy'])
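I then train it roughly like this (batch size and epoch count here are just placeholders):
model.fit(x_train, y_train,                 # x_train: (2000, 256, 256, 1), y_train: (2000, 3)
          validation_data=(x_val, y_val),   # x_val: (500, 256, 256, 1), y_val: (500, 3)
          batch_size=16, epochs=50)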
Can anyone please help? Thanks! :) Chris
The sigmoid activation produces outputs between zero and one, so if you use it as the activation of your last (output) layer, the network's outputs will be constrained to that range. A softmax would not be appropriate here: it forces the three outputs to sum to one, whereas a sigmoid constrains each output to (0, 1) independently.
output = Dense(3, activation="sigmoid")(dense)
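In context, only the output layer changes; the rest of your network stays as it is. A minimal sketch (names like image_batch are placeholders), including mapping a prediction back to pixel coordinates for a 256x256 image:
dense = Dense(256, activation='relu')(flat)
output = Dense(3, activation='sigmoid')(dense)  # each output squashed into (0, 1)
model = Model(inputs=[inputs], outputs=[output])
model.compile(optimizer=Adam(lr=1e-4), loss='mean_absolute_error')

# After training, map the normalized prediction back to pixel coordinates:
found, x_frac, y_frac = model.predict(image_batch)[0]  # image_batch: shape (1, 256, 256, 1)
x_pixel, y_pixel = x_frac * 255, y_frac * 255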