When setting up a neural network using Keras you can use either the Sequential model or the Functional API. My understanding is that the former is easy to set up and manage, and operates as a linear stack of layers, while the functional approach is useful for more complex architectures, particularly those which involve sharing the output of an internal layer. I personally like using the functional API for its versatility; however, I am having difficulty with advanced activation layers such as LeakyReLU. When using standard activations, in the sequential model one can write:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Similarly in the functional API one can write the above as:
from keras.models import Model
from keras.layers import Input, Dense

inpt = Input(shape=(100,))
dense_1 = Dense(32, activation='relu')(inpt)
out = Dense(10, activation='softmax')(dense_1)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
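Incidentally, the "sharing the output of an internal layer" case mentioned above is easy to sketch in the functional API. This minimal example (layer names and sizes are illustrative, not from the question) feeds one internal dense layer into two output heads, something a Sequential stack cannot express:

from keras.models import Model
from keras.layers import Input, Dense

inpt = Input(shape=(100,))
shared = Dense(32, activation='relu')(inpt)      # internal layer whose output is reused
out_a = Dense(10, activation='softmax')(shared)  # first head
out_b = Dense(1, activation='sigmoid')(shared)   # second head
model = Model(inpt, [out_a, out_b])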
However, when using advanced activations like LeakyReLU and PReLU, in the sequential model we write them as separate layers. For example:
from keras.layers import LeakyReLU

model = Sequential()
model.add(Dense(32, input_dim=100))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Now, I'm assuming one does the equivalent in the functional API approach:
inpt = Input(shape=(100,))
dense_1 = Dense(32)(inpt)
LR = LeakyReLU(alpha=0.1)
out = Dense(10, activation='softmax')(LR)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
My questions are:

1. Is this the correct way to use advanced activations in the functional API?
2. Why can we not use LeakyReLU in the same way as a standard activation such as 'relu', i.e. passed to the layer via the activation argument?
3. Is there a reason advanced activations have to be written as separate layers (rather than being passed to a layer like 'relu'), as I realise you could always write your activation functions, including standard ones, as new layers, although I have read that that should be avoided?

No, you forgot to connect the LeakyReLU to the dense layer:
LR = LeakyReLU(alpha=0.1)(dense_1)
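With that line in place, the full functional version of the model from the question builds and compiles:

inpt = Input(shape=(100,))
dense_1 = Dense(32)(inpt)
LR = LeakyReLU(alpha=0.1)(dense_1)  # apply the activation to dense_1's output
out = Dense(10, activation='softmax')(LR)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])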
Usually the advanced activations have tunable or learnable parameters, and these have to be stored somewhere; it makes more sense for them to be layers, as you can then access and save these parameters.
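PReLU makes this concrete: its slope coefficients are trained along with the rest of the network and live in the layer's weights, so they are saved and restored with the model. A minimal sketch (the layer sizes here are just for illustration):

from keras.models import Sequential
from keras.layers import Dense, PReLU

model = Sequential()
model.add(Dense(32, input_dim=100))
model.add(PReLU())  # one learnable slope per unit, stored as layer weights
model.add(Dense(10, activation='softmax'))

print(model.layers[1].get_weights()[0].shape)  # (32,) - one alpha per unit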