 

Changing activation function of a keras layer w/o replacing whole layer

I am trying to change the activation function of the last layer of a keras model without replacing the whole layer. In this case, I only want to replace the softmax function:

import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

model = load_model(model_path)  # Load any model
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)  # add the batch dimension expected by predict
print(model.predict(img))

My output:

array([[1.53172877e-07, 7.13159451e-08, 6.18941920e-09, 8.52070968e-07,
    1.25813088e-07, 9.98970985e-01, 1.48254022e-08, 6.09538893e-06,
    1.16236095e-07, 3.91888688e-10, 6.29304608e-08, 1.79565995e-09,
    1.75571788e-08, 1.02110009e-03, 2.14380114e-09, 9.54465733e-08,
    1.05938483e-07, 2.20544337e-07]], dtype=float32)

Then I do this to change the activation:

model.layers[-1].activation = custom_softmax
print(model.predict(img))

and the output I get is exactly the same. Any ideas how to fix this? Thanks!

To reproduce, you could use the custom_softmax below (it returns all zeros, so any change in the output would be obvious):

def custom_softmax(x, axis=-1):
    """Softmax activation function.
    # Arguments
        x: Tensor.
        axis: Integer, axis along which the softmax normalization is applied.
    # Returns
        Tensor, output of softmax transformation.
    # Raises
        ValueError: In case `dim(x) == 1`.
    """
    ndim = K.ndim(x)
    if ndim >= 2:
        return K.zeros_like(x)
    else:
        raise ValueError('Cannot apply softmax to a tensor that is 1D')
asked Mar 24 '18 by Hardian Lawi


1 Answer

At the current state of things there's no official, clean way to do that. As pointed out by @layser in the comments, the TensorFlow graph isn't being updated when you reassign the activation, which is why your output doesn't change. One option is to use keras-vis' utils. My recommendation is to isolate that in your own utils.py, like so:

from vis.utils.utils import apply_modifications

def update_layer_activation(model, activation, index=-1):
    model.layers[index].activation = activation
    return apply_modifications(model)

Usage then looks like this:

model = update_layer_activation(model, custom_softmax)

If you follow the given link, you'll see that what they do is quite simple: they save the model to a temporary path, load it back, delete the temp file, and return the reloaded model.
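
If you'd rather avoid the keras-vis dependency, here is a minimal sketch of the same save/reload trick (the helper name rebuild_model and the temp-file handling are illustrative assumptions, not part of keras-vis):

import os
import tempfile
from keras.models import load_model

def rebuild_model(model, custom_objects=None):
    # Save the model so the new activation is serialized into its config,
    # then reload it to force Keras to rebuild the underlying graph.
    tmp_path = os.path.join(tempfile.gettempdir(), 'tmp_model.h5')
    try:
        model.save(tmp_path)
        return load_model(tmp_path, custom_objects=custom_objects)
    finally:
        os.remove(tmp_path)  # clean up the temporary file

model.layers[-1].activation = custom_softmax
model = rebuild_model(model, custom_objects={'custom_softmax': custom_softmax})

Passing custom_objects matters here: since custom_softmax is not a built-in Keras activation, load_model needs it to deserialize the saved config.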

answered Oct 10 '22 by Julio Cezar Silva