
Simple Linear Regression using Keras

I have been trying to implement a simple linear regression model using a neural network in Keras, in the hope of understanding how to work with the Keras library. Unfortunately, I am ending up with a very bad model. Here is the implementation:

from pylab import *
from keras.models import Sequential
from keras.layers import Dense

# Generate dummy data
data = linspace(1, 2, 100).reshape(-1, 1)
y = data * 5

# Define the model
def baseline_model():
    model = Sequential()
    model.add(Dense(1, activation='linear', input_dim=1))
    model.compile(optimizer='rmsprop', loss='mean_squared_error', metrics=['accuracy'])
    return model

# Use the model
regr = baseline_model()
regr.fit(data, y, epochs=200, batch_size=32)
plot(data, regr.predict(data), 'b', data, y, 'k.')

The generated plot is as follows:

[Plot: predicted line (blue) vs. data points (black), showing a poor fit]

Can somebody point out the flaw in the above definition of the model (which could ensure a better fit)?

asked Jul 04 '18 by mathnoob123


3 Answers

You should increase the learning rate of the optimizer. The default learning rate of the RMSprop optimizer is 0.001, so the model takes a few hundred epochs to converge to a final solution (you have probably noticed this yourself: the loss value decreases slowly in the training log). To set the learning rate, import the optimizers module:

from keras import optimizers

# ...
model.compile(optimizer=optimizers.RMSprop(lr=0.1), loss='mean_squared_error', metrics=['mae'])

Either 0.01 or 0.1 should work fine. After this modification you may not need to train the model for 200 epochs; even 5, 10 or 20 epochs may be enough.

Also note that you are performing a regression task (i.e. predicting real numbers), whereas 'accuracy' is a metric for classification tasks (i.e. predicting discrete labels such as the category of an image). Therefore, as you can see above, I have replaced it with mae (mean absolute error), which is also much more interpretable than the value of the loss (mean squared error) used here.
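Putting both suggestions together, a minimal end-to-end sketch of the corrected script could look like the following (the choice of 20 epochs follows the note above and is an assumption; in recent Keras versions the lr argument is spelled learning_rate):

from numpy import linspace
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense

# Same dummy data as in the question
data = linspace(1, 2, 100).reshape(-1, 1)
y = data * 5

# Single linear unit; higher learning rate; 'mae' instead of 'accuracy'
model = Sequential()
model.add(Dense(1, activation='linear', input_dim=1))
model.compile(optimizer=optimizers.RMSprop(lr=0.1),
              loss='mean_squared_error',
              metrics=['mae'])

# With lr=0.1 a handful of epochs should be enough
model.fit(data, y, epochs=20, batch_size=32)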

answered Sep 18 '22 by today


The code below gives a better fit for your data. Take a look at this:

from pylab import *
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
# Jupyter magic; remove this line if you are not in a notebook
%matplotlib inline

# Generate dummy data
data = linspace(1, 2, 100).reshape(-1, 1)
y = data * 5

# Number of neurons in the first dense layer (you may change it later)
num_neurons = 17

# Define the model
def baseline_model():
    model = Sequential()
    model.add(Dense(num_neurons, activation='linear', input_dim=1))
    model.add(Dense(1, activation='linear'))
    model.compile(optimizer='rmsprop', loss='mean_squared_error')
    return model

# Use the model
regr = baseline_model()
regr.fit(data, y, epochs=200, verbose=0)
plot(data, regr.predict(data), 'bo', data, y, 'k-')

The first plot, with num_neurons = 17, is a good fit, but we can explore further.

The linked plots below show the fit for other values of num_neurons:

Plot for num_neurons = 12

Plot for num_neurons = 17

Plot for num_neurons = 19

Plot for num_neurons = 20

You can see that as we increase the number of neurons, the model fits the data more closely. A quick way to compare these settings is to sweep num_neurons and print the final loss for each, as sketched below.
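Here is a small sketch of such a sweep; it reuses data, y, Sequential and Dense from the script above, and the candidate widths mirror the linked plots:

# Compare fit quality for several hidden-layer widths
for width in [12, 17, 19, 20]:
    model = Sequential()
    model.add(Dense(width, activation='linear', input_dim=1))
    model.add(Dense(1, activation='linear'))
    model.compile(optimizer='rmsprop', loss='mean_squared_error')
    model.fit(data, y, epochs=200, verbose=0)
    # Lower MSE on the training data means a tighter fit
    print(width, model.evaluate(data, y, verbose=0))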

I hope this helps. Thank you.

answered Sep 21 '22 by Kunam


Interesting problem, so I plugged the dataset into a model-builder framework I wrote. The framework has two callbacks: an EarlyStopping callback on 'loss' and TensorBoard for analysis. I removed the 'metrics' argument from the model compile call: it is unnecessary here, and should be 'mae' anyway.
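The framework itself is not shown here, but a minimal sketch of that callback setup in plain Keras (assuming the model, data and y from the question; the min_delta, patience and log_dir values are hypothetical) could be:

from keras.callbacks import EarlyStopping, TensorBoard

callbacks = [
    # Stop training once the loss stops improving
    EarlyStopping(monitor='loss', min_delta=1e-8, patience=50),
    # Write logs for later inspection in TensorBoard
    TensorBoard(log_dir='./logs'),
]

# Large epoch budget; EarlyStopping decides when to stop
model.fit(data, y, epochs=5000, verbose=0, callbacks=callbacks)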

@mathnoob123's model as written, with learning rate (lr) = 0.01, reached loss=1.2623e-06 in 2968 epochs.

But the same model with the RMSprop optimizer replaced by Adam was the most accurate: loss=1.22e-11 in 2387 epochs.

The best compromise I found was @mathnoob123's model with lr=0.01, which reaches loss=1.5052e-4 in 230 epochs. That beats @Kunam's model (20 nodes in the first dense layer) with lr=0.001: loss=3.1824e-4 in 226 epochs.
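For reference, here is a sketch of how the two optimizer variants could be set up on the original one-layer model; the RMSprop learning rate follows the text, while Adam is left at its default since the text does not state it:

from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense

def make_model(optimizer):
    # Same single linear unit as in the question
    model = Sequential()
    model.add(Dense(1, activation='linear', input_dim=1))
    model.compile(optimizer=optimizer, loss='mean_squared_error')
    return model

rmsprop_model = make_model(optimizers.RMSprop(lr=0.01))
adam_model = make_model(optimizers.Adam())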

Now I'm going to look at randomizing the label dataset (y) and see what happens...

answered Sep 19 '22 by dturvene