Advanced Activation layers in Keras Functional API

When setting up a neural network in Keras you can use either the Sequential model or the Functional API. My understanding is that the former is easy to set up and manage, and operates as a linear stack of layers, while the functional approach is useful for more complex architectures, particularly those that involve sharing the output of an internal layer. I personally prefer the functional API for its versatility; however, I am having difficulties with advanced activation layers such as LeakyReLU. When using standard activations, in the Sequential model one can write:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Similarly in the functional API one can write the above as:

from keras.models import Model
from keras.layers import Input, Dense

inpt = Input(shape=(100,))
dense_1 = Dense(32, activation='relu')(inpt)
out = Dense(10, activation='softmax')(dense_1)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

However, when using advanced activations such as LeakyReLU and PReLU, in the Sequential model we write them as separate layers. For example:

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

model = Sequential()
model.add(Dense(32, input_dim=100))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Now, I'm assuming one does the equivalent in the functional API approach:

from keras.models import Model
from keras.layers import Input, Dense, LeakyReLU

inpt = Input(shape=(100,))
dense_1 = Dense(32)(inpt)
LR = LeakyReLU(alpha=0.1)(dense_1)
out = Dense(10, activation='softmax')(LR)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

My questions are:

  1. Is this correct syntax in the functional approach?
  2. Why does Keras require a new layer for these advanced activation functions, rather than letting us simply pass them the way we pass 'relu'?
  3. Is there something fundamentally different about creating a new layer for the activation function, rather than assigning it to an existing layer definition (as in the first examples, where we wrote 'relu')? I realise you could always write your activation functions, including standard ones, as new layers, although I have read that this should be avoided.

Asked Apr 16 '18 by Joseph Bullock


1 Answer

  1. No, you forgot to connect the LeakyReLU to the dense layer:

    LR = LeakyReLU(alpha=0.1)(dense_1)

  2. Usually the advanced activations have tunable or learnable parameters, and these have to be stored somewhere. It makes more sense for them to be layers, since you can then access and save those parameters (see the sketch after this list).

  3. Do it only if there is an advantage, such as tunable parameters.
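
To make point 2 concrete, below is a minimal sketch (reusing the 100-input, 32-unit architecture from the question; the variable names prelu, x and logits are just for illustration). It shows that an advanced activation such as PReLU carries learnable weights of its own, which is why it is a layer rather than a string passed as the activation argument, and that a standard activation can equally be written as a separate Activation layer, which is equivalent to the inline form but adds a layer for no real benefit:

from keras.models import Model
from keras.layers import Input, Dense, PReLU, Activation

inpt = Input(shape=(100,))
dense_1 = Dense(32)(inpt)

# PReLU learns one alpha per unit; the alphas live on the layer object,
# are trained along with the Dense weights, and are saved by model.save().
prelu = PReLU()
x = prelu(dense_1)

# A standard activation written as its own layer -- functionally the same
# as Dense(10, activation='softmax'), just more verbose.
logits = Dense(10)(x)
out = Activation('softmax')(logits)

model = Model(inpt, out)
print(prelu.get_weights()[0].shape)  # (32,): the learnable alphas

LeakyReLU, by contrast, has a fixed (non-trainable) alpha, but Keras still exposes it as a layer alongside the other advanced activations.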

Answered Nov 14 '22 by Dr. Snoopy