I have been trying to set up a non-linear regression problem in Keras. Unfortunately, the results show that overfitting is occurring. Here is the code:
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers, regularizers

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)
The results without regularization are shown here: Without regularization. The mean absolute error for training is much lower than for validation, and the two curves keep a fixed gap, which is a sign of overfitting.
L2 regularization was then specified for each layer like so:
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)
The results for this run are shown here: L2 regularized result. The validation MAE is now close to the training MAE, which is good. However, the training MAE is poor at 0.03 (without regularization it was much lower at 0.0028).
What can I do to reduce the training MAE while keeping regularization?
Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are summed into the loss function that the network optimizes. Regularization penalties are applied on a per-layer basis.
To add a regularizer to a layer, you simply pass the preferred regularization technique to the layer's kernel_regularizer keyword argument. The built-in Keras regularizers (e.g. regularizers.l1, regularizers.l2) take a parameter that sets the regularization strength.
L2 and L1 are the most common types of regularization. Regularization works on the premise that smaller weights lead to simpler models, which in turn helps avoid overfitting. To push the weights towards smaller values, these techniques add a 'regularization term' to the loss, and the sum is the cost function that is actually minimized.
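As a rough sketch (not the exact Keras internals), the L2-penalized cost for your mean-squared-error loss looks like this; lam plays the role of the factor you pass to regularizers.l2:

import numpy as np

def l2_penalized_cost(y_true, y_pred, weight_matrices, lam=0.001):
    # Original loss: mean squared error on the predictions.
    mse = np.mean((y_true - y_pred) ** 2)
    # Regularization term: lam times the sum of squared weights of every penalized layer.
    l2_term = lam * sum(np.sum(W ** 2) for W in weight_matrices)
    # The cost the optimizer minimizes is the loss plus the penalty.
    return mse + l2_term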
Based on your results, it looks like you need to find the right amount of regularization to balance a low training error with good generalization to the validation set. This may be as simple as reducing the L2 factor. Try reducing lambda from 0.001 to 0.0001 and comparing your results.
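For example, a weaker penalty on the same architecture would look like the sketch below (layer count and sizes copied from your question; 0.0001 is only a starting point to tune):

l2_factor = 0.0001  # weaker penalty than 0.001; tune this value
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu',
                kernel_regularizer=regularizers.l2(l2_factor)))
for _ in range(4):  # the four remaining hidden layers from the question
    model.add(Dense(int(number_of_neurons), activation='relu',
                    kernel_regularizer=regularizers.l2(l2_factor)))
model.add(Dense(outdim, activation='linear'))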
If you can't find a good parameter setting for L2, you could try dropout regularization instead. Just add model.add(Dropout(0.2)) between each pair of Dense layers, and experiment with the dropout rate if necessary; a higher dropout rate corresponds to stronger regularization.
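Here is a sketch of the same model with dropout in place of L2 (the 0.2 rate is only a starting point):

from keras.layers import Dropout

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu'))
model.add(Dropout(0.2))  # randomly zeroes 20% of this layer's outputs during training
for _ in range(4):
    model.add(Dense(int(number_of_neurons), activation='relu'))
    model.add(Dropout(0.2))
model.add(Dense(outdim, activation='linear'))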