For a scalar variable x, we know how to write down a numerically stable sigmoid function in Python:
import numpy as np

def sigmoid(x):
    if x >= 0:
        return 1. / (1. + np.exp(-x))
    else:
        return np.exp(x) / (1. + np.exp(x))
For a list of scalars, say z = [x_1, x_2, x_3, ...], and supposing we don't know the sign of each x_i beforehand, we could generalize the above definition and try:
def sigmoid(z):
    result = []
    for x in z:
        if x >= 0:
            result.append(1. / (1. + np.exp(-x)))
        else:
            result.append(np.exp(x) / (1. + np.exp(x)))
    return result
This seems to work. However, I feel this is perhaps not the most pythonic way. How should I improve the definition in terms of 'cleanness'? Say, is there a way to use comprehension to shorten the function definition?
I'm sorry if this has been asked, because I cannot find similar questions on SO. Thank you very much for your time and help!
You are right, you can do better by using np.where, the numpy equivalent of if:
def sigmoid(x):
    return np.where(x >= 0,
                    1 / (1 + np.exp(-x)),
                    np.exp(x) / (1 + np.exp(x)))
This function takes a numpy array x and returns a numpy array, too:
data = np.arange(-5,5)
sigmoid(data)
#array([0.00669285, 0.01798621, 0.04742587, 0.11920292, 0.26894142,
# 0.5 , 0.73105858, 0.88079708, 0.95257413, 0.98201379])
A fully correct answer (no warnings) was provided by @hao peng, but the solution wasn't explained clearly. This would be too long for a comment, so I'll go for an answer.
Let's start with an analysis of a few answers (pure numpy answers only):
This one is correct mathematically but still gives us a warning. Let's look at the code:
def sigmoid(x):
    return np.where(
        x >= 0,  # condition
        1 / (1 + np.exp(-x)),  # For positive values
        np.exp(x) / (1 + np.exp(x))  # For negative values
    )
As both branches are evaluated (they are arguments, they have to be), the first branch will give us a warning for negative values and the second for positive.
Although the warnings will be raised, results from overflows will not be incorporated, hence the result is correct.
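To make this concrete, here is a minimal sketch that records the warnings while checking that the selected values are still right. The helper name where_sigmoid and the test inputs are illustrative and not taken from the original answers; the exact warning messages may vary between NumPy versions.

```python
import warnings

import numpy as np


def where_sigmoid(x):
    # Both branch expressions are computed for the whole array
    # before np.where picks one value per element.
    return np.where(x >= 0, 1 / (1 + np.exp(-x)), np.exp(x) / (1 + np.exp(x)))


x = np.array([-1000.0, 0.0, 1000.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is filtered out
    result = where_sigmoid(x)

print(result)                            # the selected values are finite
print([str(w.message) for w in caught])  # overflow / invalid-value warnings
```

The warnings come from the branch that is thrown away for each element, which is why the output itself stays correct.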
This one is almost correct, BUT will work only on floating point values, see below:
def sigmoid(x):
    return np.piecewise(
        x,
        [x > 0],
        [lambda i: 1 / (1 + np.exp(-i)), lambda i: np.exp(i) / (1 + np.exp(i))],
    )

sigmoid(np.array([0.0, 1.0]))  # [0.5 0.73105858] correct
sigmoid(np.array([0, 1]))  # [0, 0] incorrect
Why? Longer answer was provided by @mhawke in another thread, but the main point is:
It seems that piecewise() converts the return values to the same type as the input so, when an integer is input an integer conversion is performed on the result, which is then returned.
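One possible workaround, sketched here under the assumption that converting the input to float64 up front is acceptable for the caller, is to cast before calling np.piecewise. The name piecewise_sigmoid is illustrative only.

```python
import numpy as np


def piecewise_sigmoid(x):
    # Assumption: a float64 copy of the input is acceptable.
    # Casting first means piecewise allocates a float output array,
    # so results are no longer truncated to integers.
    x = np.asarray(x, dtype=np.float64)
    return np.piecewise(
        x,
        [x > 0],
        [lambda i: 1 / (1 + np.exp(-i)), lambda i: np.exp(i) / (1 + np.exp(i))],
    )


print(piecewise_sigmoid(np.array([0, 1])))      # integer input
print(piecewise_sigmoid(np.array([0.0, 1.0])))  # float input, same values
```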
The idea of a stable sigmoid comes from the fact that:
sigmoid(x) = 1 / (1 + e^-x) = e^x / (1 + e^x)
Both versions are equally efficient in terms of operations if coded correctly (one exp evaluation is enough). Now:
e^x will overflow when x is large and positive
e^-x will overflow when x is large and negative
Hence we have to branch at x equal to zero. Using numpy's masking we can transform only the part of the array which is positive or negative with the matching sigmoid implementation.
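The overflow behaviour is easy to check directly. The following is a small sketch with illustrative values (1000 and the range -20..20 are arbitrary choices, not from the answer), assuming float64 arithmetic:

```python
import numpy as np

# exp of a large positive number overflows float64 and becomes inf.
with np.errstate(over="ignore"):
    big = np.exp(1000.0)

# exp of a large negative number underflows to 0.0, which is harmless here.
small = np.exp(-1000.0)

print(np.isinf(big), small == 0.0)

# For moderate x the two forms of the sigmoid give the same values.
x = np.linspace(-20, 20, 9)
form_a = 1 / (1 + np.exp(-x))
form_b = np.exp(x) / (1 + np.exp(x))
print(np.allclose(form_a, form_b))
```

So each form is safe on one side of zero only, which is what motivates picking the form per element.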
See code comments for additional points:
def _positive_sigmoid(x):
    return 1 / (1 + np.exp(-x))


def _negative_sigmoid(x):
    # Cache exp so you won't have to calculate it twice
    exp = np.exp(x)
    return exp / (exp + 1)


def sigmoid(x):
    positive = x >= 0
    # Boolean array inversion is faster than another comparison
    negative = ~positive

    # empty contains junk hence will be faster to allocate
    # Zeros has to zero-out the array after allocation, no need for that
    # See comment to the answer when it comes to dtype
    # (np.float64 is used because the np.float alias is gone from recent NumPy)
    result = np.empty_like(x, dtype=np.float64)
    result[positive] = _positive_sigmoid(x[positive])
    result[negative] = _negative_sigmoid(x[negative])

    return result
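As a side note, a different way to avoid both the overflow and the warnings, which is not what the answers above use, is to only ever exponentiate a non-positive number. A minimal sketch, with the hypothetical name sigmoid_abs and assuming a float64 result is fine:

```python
import numpy as np


def sigmoid_abs(x):
    # Assumption: output as float64 is acceptable.
    x = np.asarray(x, dtype=np.float64)
    # -|x| is never positive, so z lies in [0, 1] and exp cannot overflow.
    z = np.exp(-np.abs(x))
    # x >= 0: z = e^-x, so 1 / (1 + z) is the sigmoid.
    # x <  0: z = e^x,  so z / (1 + z) is the sigmoid.
    return np.where(x >= 0, 1 / (1 + z), z / (1 + z))


print(sigmoid_abs(np.array([-1000, 0, 1000])))
```

Both arguments of np.where are still computed for every element, but neither can overflow, so no warnings are expected. Whether this is faster than masking would need to be measured.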
Results (50 times case test from ynn):
289.5070939064026 #DYZ
222.49267292022705 #ynn
230.81086134910583 #this
Indeed piecewise seems faster (not sure about the reasons, maybe masking and additional masking ops make it slower).
Code below was used:
import time

import numpy as np


def _positive_sigmoid(x):
    return 1 / (1 + np.exp(-x))


def _negative_sigmoid(x):
    # Cache exp so you won't have to calculate it twice
    exp = np.exp(x)
    return exp / (exp + 1)


def sigmoid(x):
    positive = x >= 0
    # Boolean array inversion is faster than another comparison
    negative = ~positive

    # empty contains junk hence will be faster to allocate than zeros
    result = np.empty_like(x)
    result[positive] = _positive_sigmoid(x[positive])
    result[negative] = _negative_sigmoid(x[negative])

    return result


N = int(1e4)
x = np.random.uniform(size=(N, N))

# np.where version (DYZ)
start: float = time.time()
for _ in range(50):
    y1 = np.where(x > 0, 1 / (1 + np.exp(-x)), np.exp(x) / (1 + np.exp(x)))
    y1 += 1
end: float = time.time()
print(end - start)

# np.piecewise version (ynn)
start: float = time.time()
for _ in range(50):
    y2 = np.piecewise(
        x,
        [x > 0],
        [lambda i: 1 / (1 + np.exp(-i)), lambda i: np.exp(i) / (1 + np.exp(i))],
    )
    y2 += 1
end: float = time.time()
print(end - start)

# masking version (this answer)
start: float = time.time()
for _ in range(50):
    y2 = sigmoid(x)
    y2 += 1
end: float = time.time()
print(end - start)