I am trying to change the activation function of the last layer of a Keras model without replacing the whole layer; in this case, only the softmax function:
import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

model = load_model(model_path)  # load any model
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)  # add a batch dimension: (1, 224, 224, 3)
print(model.predict(img))
My output:
array([[1.53172877e-07, 7.13159451e-08, 6.18941920e-09, 8.52070968e-07,
1.25813088e-07, 9.98970985e-01, 1.48254022e-08, 6.09538893e-06,
1.16236095e-07, 3.91888688e-10, 6.29304608e-08, 1.79565995e-09,
1.75571788e-08, 1.02110009e-03, 2.14380114e-09, 9.54465733e-08,
1.05938483e-07, 2.20544337e-07]], dtype=float32)
Then I do this to change the activation:
model.layers[-1].activation = custom_softmax
print(model.predict(img))
and the output I get is exactly the same. Any ideas how to fix this? Thanks!
You could try to use the custom_softmax below:
def custom_softmax(x, axis=-1):
    """Softmax activation function.

    # Arguments
        x: Tensor.
        axis: Integer, axis along which the softmax normalization is applied.

    # Returns
        Tensor, output of softmax transformation.

    # Raises
        ValueError: In case `dim(x) == 1`.
    """
    ndim = K.ndim(x)
    if ndim >= 2:
        # Deliberately returns zeros so a successful activation swap is obvious in the output
        return K.zeros_like(x)
    else:
        raise ValueError('Cannot apply softmax to a tensor that is 1D')
At the current state of things there's no official, clean way to do that. As pointed out by @layser in the comments, the TensorFlow graph isn't being updated, which is why your output doesn't change. One option is to use keras-vis' utils. My recommendation is to isolate that in your own utils.py, like so:
from vis.utils.utils import apply_modifications

def update_layer_activation(model, activation, index=-1):
    model.layers[index].activation = activation
    return apply_modifications(model)
Which you would then use like this:
model = update_layer_activation(model, custom_softmax)
If you follow the given link, you'll see what they do is quite simple: they save the model to a temporary path, load it back and return it, and finally delete the temp file.