Keras - unable to reduce loss between epochs

I am training a VGG-like convnet (like the example at http://keras.io/examples/) on a set of images. I convert the images to arrays and resize them with scipy:

import numpy as np
import scipy as sp
import scipy.misc
from keras.preprocessing.image import load_img, img_to_array

mapper = []  # list of photo ids
data = np.empty((NB_FILES, 3, 100, 100), dtype='float32')

for i, f in enumerate(onlyfiles[:NB_FILES]):
    img = load_img(mypath + f)
    a = img_to_array(img)  # (3, height, width) with Theano dim ordering

    # resize each channel to 100x100 and rescale pixel values to [0, 1]
    a_resize = np.empty((3, 100, 100))
    a_resize[0, :, :] = sp.misc.imresize(a[0, :, :], (100, 100)) / 255.0  # - 0.5
    a_resize[1, :, :] = sp.misc.imresize(a[1, :, :], (100, 100)) / 255.0  # - 0.5
    a_resize[2, :, :] = sp.misc.imresize(a[2, :, :], (100, 100)) / 255.0  # - 0.5

    photo_id = int(f.split('.')[0])  # file names start with the numeric photo id
    mapper.append(photo_id)
    data[i, :, :, :] = a_resize

The last dense layer has 2 neurons with a softmax activation. Here are the last lines:

model.add(Dense(2))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(data, target_matrix, batch_size=32, nb_epoch=5, verbose=1, show_accuracy=True, validation_split=0.2)
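
(For reference, target_matrix holds one-hot labels of shape (NB_FILES, 2), as categorical_crossentropy with a 2-unit softmax requires. A minimal sketch of building it, assuming a labels list of integer class ids; labels is a hypothetical name not shown above:)

from keras.utils import np_utils

# labels: one integer class id (0 or 1) per image, aligned with data
target_matrix = np_utils.to_categorical(np.array(labels), 2)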

I am not able to reduce the loss; every epoch ends with the same loss and the same accuracy as the one before. The loss actually goes up between the 1st and 2nd epochs:

Train on 1600 samples, validate on 400 samples
Epoch 1/5
1600/1600 [==============================] - 23s - loss: 3.4371 - acc: 0.7744 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 2/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 3/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 4/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 5/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625

What am I doing wrong?

asked Feb 24 '16 by thecheech


2 Answers

In my experience, this often happens when the learning rate is too high: the optimizer cannot settle into a minimum and just "bounces around".

The ideal rate will depend on your data and on the architecture of your network.

(For reference, I'm currently running an 8-layer convnet on a sample size similar to yours, and I saw the same lack of convergence until I reduced the learning rate to 0.001.)
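
With the question's setup, that would just mean recompiling with a smaller lr; the rest of the SGD settings can stay as they are (0.001 is the value that worked for me, not a universal constant):

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)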

answered by Gerome Pistre


My suggestions would be to reduce the learning rate and to try data augmentation.

Data augmentation code:

from keras.preprocessing.image import ImageDataGenerator

print('Using real-time data augmentation.')

# this will do preprocessing and realtime data augmentation
datagen = ImageDataGenerator(
    featurewise_center=False,  # set input mean to 0 over the dataset
    samplewise_center=False,  # set each sample mean to 0
    featurewise_std_normalization=False,  # divide inputs by std of the dataset
    samplewise_std_normalization=False,  # divide each input by its std
    zca_whitening=True,  # apply ZCA whitening
    rotation_range=90,  # randomly rotate images in the range (degrees, 0 to 180)
    width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,  # randomly flip images horizontally
    vertical_flip=False)  # do not flip images vertically

# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)

# fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch)
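
If you want to sanity-check the augmentation before training, you can pull one batch out of datagen.flow and inspect it (a quick sketch; X_batch and Y_batch are just local names):

# draw a single augmented batch and check shapes and value ranges
X_batch, Y_batch = next(datagen.flow(X_train, Y_train, batch_size=32))
print(X_batch.shape, Y_batch.shape, X_batch.min(), X_batch.max())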

answered by Avijit Dasgupta