
Why is ReLU used in regression with Neural Networks?

I am following the official TensorFlow Keras tutorial and got stuck at this step: Predict house prices: regression - Create the model

Why is an activation function used for a task where a continuous value is predicted?

The code is:

import tensorflow as tf
from tensorflow import keras

def build_model():
    # two hidden layers with ReLU, plus a single output unit
    model = keras.Sequential([
        keras.layers.Dense(64, activation=tf.nn.relu,
                           input_shape=(train_data.shape[1],)),
        keras.layers.Dense(64, activation=tf.nn.relu),
        keras.layers.Dense(1)
    ])

    optimizer = tf.train.RMSPropOptimizer(0.001)  # TF 1.x optimizer, as in the tutorial

    model.compile(loss='mse', optimizer=optimizer, metrics=['mae'])
    return model
asked Jul 20 '18 by Popovici Andrei-Sorin

People also ask

Why do we use ReLU in neural network?

ReLU stands for Rectified Linear Unit. The main advantage of using the ReLU function over other activation functions is that it does not activate all the neurons at the same time.
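
A minimal sketch of that sparsity (assuming TensorFlow 2.x, which is not what the question itself uses): ReLU returns 0 for every negative pre-activation, so only some of the units produce a non-zero output for a given input.

import tensorflow as tf

# ReLU zeroes out negative pre-activations, so only part of the layer "fires"
pre_activations = tf.constant([-2.0, -0.5, 0.0, 0.7, 3.1])
print(tf.nn.relu(pre_activations).numpy())  # [0.  0.  0.  0.7 3.1]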

Which activation function is used for regression in neural network?

In a regression problem, we use the linear (identity) activation function with one node. In a binary classifier, we use the sigmoid activation function with one node. In a multiclass classification problem, we use the softmax activation function with one node per class.
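
A minimal sketch of those three output layers in Keras (assuming TensorFlow 2.x; the layer sizes and the 10-class example are only illustrative):

from tensorflow import keras

regression_head = keras.layers.Dense(1)                         # linear (identity) output
binary_head     = keras.layers.Dense(1, activation='sigmoid')   # probability of the positive class
multiclass_head = keras.layers.Dense(10, activation='softmax')  # one node per class (10 here)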

Why ReLU is better than sigmoid for neural networks?

Models trained with ReLU typically converge much faster than models trained with the sigmoid function, and so take much less time to train. Because of this quick convergence, a ReLU-trained model can also start to overfit sooner. Overall, model performance is significantly better when trained with ReLU.

Why is ReLU so effective?

In fact, this is exactly ReLU's advantage: it can bend an otherwise linear function at a chosen point and to a chosen degree. Combined with the weights and biases of the previous layer, a ReLU unit can place that bend at any location and with any slope, which is what lets stacks of such units model non-linear relationships.
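
A minimal sketch of that bend (plain Python, illustrative numbers only): for a single unit relu(w*x + b), the output stays at 0 until x = -b/w and then grows linearly with slope w.

def relu(z):
    return max(0.0, z)

w, b = 2.0, -4.0                   # bend located at x = -b/w = 2.0
for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    print(x, relu(w * x + b))      # 0.0, 0.0, 0.0, 2.0, 4.0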


1 Answer

The general reason for using non-linear activation functions in hidden layers is that, without them, no matter how many layers or how many units per layer, the network would behave just like a simple linear unit. This is nicely explained in this short video by Andrew Ng: Why do you need non-linear activation functions?
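
A minimal NumPy sketch of that point: composing two layers with no activation collapses into a single linear map, because W2 @ (W1 @ x + b1) + b2 is just (W2 @ W1) @ x + (W2 @ b1 + b2).

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5,))
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=(4,))
W2, b2 = rng.normal(size=(3, 4)), rng.normal(size=(3,))

two_linear_layers = W2 @ (W1 @ x + b1) + b2
one_linear_layer  = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_linear_layers, one_linear_layer))  # True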

In your case, looking more closely, you'll see that the activation function of your final layer is not relu, as in your hidden layers, but the linear one (which is the default activation when you don't specify anything, as here):

keras.layers.Dense(1)

From the Keras docs:

Dense

[...]

Arguments

[...]

activation: Activation function to use (see activations). If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).

which is indeed what is expected for a regression network with a single continuous output.
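
A minimal sketch (assuming TensorFlow 2.x Keras) checking that Dense(1) with no activation behaves exactly like an explicit 'linear' activation, so the regression output remains an unbounded continuous value:

import numpy as np
from tensorflow import keras

inputs = np.random.rand(3, 4).astype('float32')

default_head = keras.layers.Dense(1)                       # no activation specified
linear_head  = keras.layers.Dense(1, activation='linear')

# build both layers and copy the weights so they compute the same function
default_head.build((None, 4))
linear_head.build((None, 4))
linear_head.set_weights(default_head.get_weights())

print(np.allclose(default_head(inputs), linear_head(inputs)))  # True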

answered Oct 12 '22 by desertnaut