 

How to use keras ReduceLROnPlateau

I am training a Keras Sequential model, and I want the learning rate to be reduced when training stops progressing.

I use the ReduceLROnPlateau callback.

After the first 2 epochs without progress, the learning rate is reduced as expected. But then it is reduced every 2 epochs, causing training to stop progressing.

Is that a Keras bug, or am I using the function the wrong way?

The code:

from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

earlystopper = EarlyStopping(patience=8, verbose=1)
checkpointer = ModelCheckpoint(filepath='model_zero7.{epoch:02d}-{val_loss:.6f}.hdf5',
                               verbose=1,
                               save_best_only=True, save_weights_only=True)

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=2, min_lr=0.000001, verbose=1)

history_zero7 = model_zero.fit_generator(bach_gen_only1,
                                         validation_data=(v_im, v_lb),
                                         steps_per_epoch=25, epochs=100,
                                         callbacks=[earlystopper, checkpointer, reduce_lr])

The output:

Epoch 00006: val_loss did not improve from 0.68605
Epoch 7/100
25/25 [==============================] - 213s 9s/step - loss: 0.6873 - binary_crossentropy: 0.0797 - dice_coef_loss: -0.8224 - jaccard_distance_loss_flat: 0.2998 - val_loss: 0.6865 - val_binary_crossentropy: 0.0668 - val_dice_coef_loss: -0.8513 - val_jaccard_distance_loss_flat: 0.2578

Epoch 00007: val_loss did not improve from 0.68605

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.000200000009499.
Epoch 8/100
25/25 [==============================] - 214s 9s/step - loss: 0.6865 - binary_crossentropy: 0.0648 - dice_coef_loss: -0.8547 - jaccard_distance_loss_flat: 0.2528 - val_loss: 0.6860 - val_binary_crossentropy: 0.0694 - val_dice_coef_loss: -0.8575 - val_jaccard_distance_loss_flat: 0.2485

Epoch 00008: val_loss improved from 0.68605 to 0.68598, saving model to model_zero7.08-0.685983.hdf5
Epoch 9/100
25/25 [==============================] - 208s 8s/step - loss: 0.6868 - binary_crossentropy: 0.0624 - dice_coef_loss: -0.8554 - jaccard_distance_loss_flat: 0.2518 - val_loss: 0.6860 - val_binary_crossentropy: 0.0746 - val_dice_coef_loss: -0.8527 - val_jaccard_distance_loss_flat: 0.2557

Epoch 00009: val_loss improved from 0.68598 to 0.68598, saving model to model_zero7.09-0.685982.hdf5

Epoch 00009: ReduceLROnPlateau reducing learning rate to 4.00000018999e-05.
Epoch 10/100
25/25 [==============================] - 211s 8s/step - loss: 0.6865 - binary_crossentropy: 0.0640 - dice_coef_loss: -0.8570 - jaccard_distance_loss_flat: 0.2493 - val_loss: 0.6859 - val_binary_crossentropy: 0.0630 - val_dice_coef_loss: -0.8688 - val_jaccard_distance_loss_flat: 0.2311

Epoch 00010: val_loss improved from 0.68598 to 0.68589, saving model to model_zero7.10-0.685890.hdf5
Epoch 11/100
25/25 [==============================] - 211s 8s/step - loss: 0.6869 - binary_crossentropy: 0.0610 - dice_coef_loss: -0.8580 - jaccard_distance_loss_flat: 0.2480 - val_loss: 0.6859 - val_binary_crossentropy: 0.0681 - val_dice_coef_loss: -0.8616 - val_jaccard_distance_loss_flat: 0.2422

Epoch 00011: val_loss improved from 0.68589 to 0.68589, saving model to model_zero7.11-0.685885.hdf5
Epoch 12/100
25/25 [==============================] - 210s 8s/step - loss: 0.6866 - binary_crossentropy: 0.0575 - dice_coef_loss: -0.8612 - jaccard_distance_loss_flat: 0.2426 - val_loss: 0.6858 - val_binary_crossentropy: 0.0636 - val_dice_coef_loss: -0.8679 - val_jaccard_distance_loss_flat: 0.2325

Epoch 00012: val_loss improved from 0.68589 to 0.68585, saving model to model_zero7.12-0.685847.hdf5

Epoch 00012: ReduceLROnPlateau reducing learning rate to 8.0000005255e-06.

The first 6 epochs:

Epoch 1/100
25/25 [==============================] - 254s 10s/step - loss: 0.6886 - binary_crossentropy: 0.1356 - dice_coef_loss: -0.7302 - jaccard_distance_loss_flat: 0.4151 - val_loss: 0.6867 - val_binary_crossentropy: 0.1013 - val_dice_coef_loss: -0.8161 - val_jaccard_distance_loss_flat: 0.3096

Epoch 00001: val_loss improved from inf to 0.68673, saving model to model_zero7.01-0.686732.hdf5
Epoch 2/100
25/25 [==============================] - 211s 8s/step - loss: 0.6871 - binary_crossentropy: 0.0805 - dice_coef_loss: -0.8274 - jaccard_distance_loss_flat: 0.2932 - val_loss: 0.6865 - val_binary_crossentropy: 0.1005 - val_dice_coef_loss: -0.8100 - val_jaccard_distance_loss_flat: 0.3183

Epoch 00002: val_loss improved from 0.68673 to 0.68653, saving model to model_zero7.02-0.686533.hdf5
Epoch 3/100
25/25 [==============================] - 214s 9s/step - loss: 0.6871 - binary_crossentropy: 0.0778 - dice_coef_loss: -0.8268 - jaccard_distance_loss_flat: 0.2934 - val_loss: 0.6863 - val_binary_crossentropy: 0.0811 - val_dice_coef_loss: -0.8402 - val_jaccard_distance_loss_flat: 0.2743

Epoch 00003: val_loss improved from 0.68653 to 0.68635, saving model to model_zero7.03-0.686345.hdf5
Epoch 4/100
25/25 [==============================] - 210s 8s/step - loss: 0.6869 - binary_crossentropy: 0.0692 - dice_coef_loss: -0.8397 - jaccard_distance_loss_flat: 0.2749 - val_loss: 0.6862 - val_binary_crossentropy: 0.0820 - val_dice_coef_loss: -0.8445 - val_jaccard_distance_loss_flat: 0.2682

Epoch 00004: val_loss improved from 0.68635 to 0.68621, saving model to model_zero7.04-0.686206.hdf5
Epoch 5/100
25/25 [==============================] - 208s 8s/step - loss: 0.6868 - binary_crossentropy: 0.0693 - dice_coef_loss: -0.8446 - jaccard_distance_loss_flat: 0.2676 - val_loss: 0.6861 - val_binary_crossentropy: 0.0761 - val_dice_coef_loss: -0.8495 - val_jaccard_distance_loss_flat: 0.2606

Epoch 00005: val_loss improved from 0.68621 to 0.68605, saving model to model_zero7.05-0.686055.hdf5
Epoch 6/100
25/25 [==============================] - 203s 8s/step - loss: 0.6874 - binary_crossentropy: 0.0792 - dice_coef_loss: -0.8200 - jaccard_distance_loss_flat: 0.3024 - val_loss: 0.6865 - val_binary_crossentropy: 0.0559 - val_dice_coef_loss: -0.8716 - val_jaccard_distance_loss_flat: 0.2269

Epoch 00006: val_loss did not improve from 0.68605
asked Aug 17 '18 by Naomi Fridman



2 Answers

I don't think this should be blamed on that bug, because it appears to have been fixed back in 2016. Note that the callback has a relevant argument:

min_delta: threshold for measuring the new optimum, to only focus on significant changes.

It defaults to 0.0001. So even though val_loss improved since the last epoch, if the improvement is smaller than min_delta, ReduceLROnPlateau still treats it as no improvement. That matches your log: the "improvements" from 0.68605 to 0.68598 and from 0.68598 to 0.68598 are far below the default threshold, so the patience counter keeps running and the learning rate keeps getting reduced.
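
If the improvements are real but tiny, one option is to lower min_delta so those small reductions in val_loss count as progress. A minimal sketch, assuming the same callback configuration as in the question:

from keras.callbacks import ReduceLROnPlateau

# Same callback as in the question, but with a smaller min_delta so that
# improvements on the order of 1e-5 reset the patience counter.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=2, min_delta=1e-6,
                              min_lr=0.000001, verbose=1)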

answered Oct 14 '22 by chongkai Lu


Well, it is a bug in Keras: https://github.com/keras-team/keras/issues/3991

To work around it, use: cooldown=1
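
A minimal sketch of that workaround, using the callback from the question; cooldown tells ReduceLROnPlateau to wait the given number of epochs after a reduction before it resumes monitoring:

from keras.callbacks import ReduceLROnPlateau

# cooldown=1: after each LR reduction, wait 1 epoch before counting
# non-improving epochs again, which prevents back-to-back reductions.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=2, cooldown=1,
                              min_lr=0.000001, verbose=1)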

answered Oct 14 '22 by Naomi Fridman