How to use different activation functions in one Keras layer?

I am working with Keras in Python and I have a neural network (see code below). Currently it uses only ReLU activations.

For experimental reasons I would like to have some neurons use ReLU and some use softmax (or any other activation function). For example, in a layer with 20 neurons, I would like 10 to use ReLU and 10 to use softmax.

I have tried a few different approaches, but I always fail to get an output.

How should I do this?

# - Libraries
from keras.layers import Dense
from keras.models import Sequential
from keras.callbacks import EarlyStopping

early_spotting_monitor = EarlyStopping(patience=2)
layers = 4
neurons = 20
act = "relu"  # Keras expects the lowercase activation name

# - Create Neural Network
model = Sequential()
model.add(Dense(neurons, activation=act, input_dim=X_train.shape[1]))

# add the remaining hidden layers
layers -= 1
while layers > 0:
    model.add(Dense(neurons, activation=act))
    layers -= 1
model.add(Dense(n_months))
model.compile(optimizer="adam", loss="mean_absolute_error")

model.fit(X_train, Y_train, validation_split=0.10, epochs=13, callbacks=[early_spotting_monitor])

EDIT: this is my (working) code now:

# - Libraries
from keras.callbacks import EarlyStopping
from keras.layers import Input, Dense
from keras.layers.merge import concatenate
from keras.models import Model

early_spotting_monitor = EarlyStopping(patience=2)

# input layer
visible = Input(shape=(X_train.shape[1],))

# first hidden block: three parallel branches with different activations
hidden11 = Dense(14, activation='relu')(visible)
hidden12 = Dense(3, activation='softplus')(visible)
hidden13 = Dense(2, activation='selu')(visible)
merge1 = concatenate([hidden11, hidden12, hidden13])

# second hidden block, same idea
hidden21 = Dense(14, activation='relu')(merge1)
hidden22 = Dense(3, activation='softplus')(merge1)
hidden23 = Dense(2, activation='linear')(merge1)
merge2 = concatenate([hidden21, hidden22, hidden23])

hidden3 = Dense(20, activation='relu')(merge2)

output = Dense(Y_train.shape[1], activation="linear")(hidden3)
model = Model(inputs=visible, outputs=output)

model.compile(optimizer="adam", loss="mean_absolute_error")
model.fit(X_train, Y_train, validation_split=0.10, epochs=13, callbacks=[early_spotting_monitor])  # starts training
asked Dec 12 '17 by Nicolas



2 Answers

You have to use the Functional API to do this, for example:

input = Input(shape=(X_train.shape[1],))  # note the trailing comma: shape must be a tuple
branchA = Dense(neuronsA, activation="relu")(input)
branchB = Dense(neuronsB, activation="sigmoid")(input)

# stitch the two branches back together into a single layer output
out = concatenate([branchA, branchB])

You cannot do this with the Sequential API, so I recommend moving your code to the Functional API.
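
For reference, here is a minimal end-to-end sketch of that idea, assuming a regression setup with placeholder sizes (20 input features, 10 ReLU neurons and 10 softmax neurons per branch, a single output); adapt the shapes and loss to your data:

from keras.layers import Input, Dense, concatenate
from keras.models import Model

n_features = 20   # placeholder: number of input features
neuronsA = 10     # branch of "ReLU neurons"
neuronsB = 10     # branch of "softmax neurons"

inputs = Input(shape=(n_features,))
branchA = Dense(neuronsA, activation="relu")(inputs)
branchB = Dense(neuronsB, activation="softmax")(inputs)

# concatenating the two branches gives one 20-unit "layer" with mixed activations
merged = concatenate([branchA, branchB])
output = Dense(1, activation="linear")(merged)

model = Model(inputs=inputs, outputs=output)
model.compile(optimizer="adam", loss="mean_absolute_error")
model.summary()

The concatenated tensor can then feed further mixed-activation blocks in exactly the same way, as in the asker's edited code above.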

answered Sep 20 '22 by Dr. Snoopy


So this is something I have been trying to do recently, and so far this is what I have done. I think it's working, but I would appreciate it if anyone could tell me what I'm doing wrong here. I'm only doing this on the output layer, and my output layer has two units:

import tensorflow as tf
from tensorflow.keras.layers import Dense

def activations(l):
    # apply a different activation to each of the two output units
    l_0 = tf.keras.activations.exponential(l[..., 0])  # first unit: exponential
    l_1 = tf.keras.activations.elu(l[..., 1])           # second unit: ELU
    lnew = tf.stack([l_0, l_1], axis=1)                 # reassemble into shape (batch, 2)
    return lnew

model = tf.keras.Sequential([..., Dense(2, activation=activations)])
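
One quick way to sanity-check an approach like this is to call the custom activation on a dummy tensor and confirm the shape and values come out as expected. Here is a small sketch, assuming the activations function defined above:

import tensorflow as tf

# dummy pre-activation output for a batch of 3 samples and 2 units
z = tf.constant([[0.0, -1.0],
                 [1.0,  2.0],
                 [-2.0, 0.5]])

out = activations(z)
print(out.shape)  # (3, 2): exponential applied to column 0, ELU to column 1

Wrapping both activations in one function keeps this compatible with the Sequential API, but splitting the layer into separate Dense branches with the Functional API (as in the other answer) is often clearer when the units differ in more than just the activation.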
answered Sep 17 '22 by armen