I am working on deep nets using Keras. There is an activation called "hard sigmoid". What is its mathematical definition?
I know what the sigmoid is. Someone asked a similar question on Quora: https://www.quora.com/What-is-hard-sigmoid-in-artificial-neural-networks-Why-is-it-faster-than-standard-sigmoid-Are-there-any-disadvantages-over-the-standard-sigmoid
But I could not find the precise mathematical definition anywhere.
The Hard Sigmoid is an activation function used in neural networks, of the form f(x) = max(0, min(1, (x + 1)/2)). Source: BinaryConnect: Training Deep Neural Networks with binary weights during propagations.
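A minimal NumPy sketch of this piecewise-linear form (the function name hard_sigmoid_bc is just illustrative, not from the paper):
import numpy as np

def hard_sigmoid_bc(x):
    # BinaryConnect form: clip((x + 1) / 2, 0, 1)
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

print(hard_sigmoid_bc(np.array([-2.0, -1.0, 0.0, 1.0, 2.0])))  # [0.  0.  0.5 1.  1. ]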
Hyperbolic tangent function: similar in shape, but this time the output range is (-1, +1). The advantage over the sigmoid function is that its derivative is steeper around zero, so it yields larger gradients; the wider output range can make learning faster.
Hard Swish is a type of activation function based on Swish, but replaces the computationally expensive sigmoid with a piecewise linear analogue: h-swish(x) = x · ReLU6(x + 3) / 6. Source: Searching for MobileNetV3.
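A short NumPy sketch of that formula (illustrative only; relu6 and hard_swish are names I chose here, not library functions):
import numpy as np

def relu6(x):
    # ReLU capped at 6: min(max(x, 0), 6)
    return np.minimum(np.maximum(x, 0.0), 6.0)

def hard_swish(x):
    # h-swish(x) = x * ReLU6(x + 3) / 6
    return x * relu6(x + 3.0) / 6.0

print(hard_swish(np.array([-4.0, -1.0, 0.0, 1.0, 4.0])))  # approx. [-0., -0.333, 0., 0.667, 4.]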
More generally, a sigmoid is an S-shaped curve that describes many processes in psychology, including learning and responding to test items. The curve starts low, has a period of acceleration, and then approaches an asymptote. The curve is often characterized by the logistic function.
Since Keras supports both TensorFlow and Theano, the exact implementation might differ for each backend; I'll cover Theano only. For the Theano backend, Keras uses T.nnet.hard_sigmoid, which is in turn a linear approximation of the standard sigmoid:
slope = tensor.constant(0.2, dtype=out_dtype)
shift = tensor.constant(0.5, dtype=out_dtype)
x = (x * slope) + shift
x = tensor.clip(x, 0, 1)
i.e. it is: max(0, min(1, x*0.2 + 0.5))
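In plain NumPy, the Keras/Theano variant is equivalent to the following sketch (the name keras_hard_sigmoid is just illustrative):
import numpy as np

def keras_hard_sigmoid(x):
    # Keras/Theano variant: clip(0.2 * x + 0.5, 0, 1)
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

print(keras_hard_sigmoid(np.array([-3.0, -2.5, 0.0, 2.5, 3.0])))  # [0.  0.  0.5 1.  1. ]
It matches the logistic sigmoid at 0 and saturates for |x| >= 2.5.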
The hard sigmoid is normally a piecewise linear approximation of the logistic sigmoid function. Depending on what properties of the original sigmoid you want to keep, you can use a different approximation.
I personally like to keep the function correct at zero, i.e. σ(0) = 0.5 (shift) and σ'(0) = 0.25 (slope). This could be coded as follows:
import numpy as np

def hard_sigmoid(x):
    return np.maximum(0, np.minimum(1, (x + 2) / 4))  # clip(x/4 + 0.5, 0, 1)
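As a quick check of those two properties (an illustrative snippet, not part of the answer):
import numpy as np

def hard_sigmoid(x):
    return np.maximum(0, np.minimum(1, (x + 2) / 4))

print(hard_sigmoid(0.0))                               # 0.5, same as the logistic sigmoid at 0
print((hard_sigmoid(0.1) - hard_sigmoid(-0.1)) / 0.2)  # approx. 0.25, the slope near 0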
For reference, the hard sigmoid function may be defined differently in different places. In Courbariaux et al. 2016 [1] it's defined as:
σ is the “hard sigmoid” function: σ(x) = clip((x + 1)/2, 0, 1) = max(0, min(1, (x + 1)/2))
The intent is to provide a probability value (hence constraining it to be between 0 and 1) for use in stochastic binarization of neural network parameters (e.g. weights, activations, gradients). You use the probability p = σ(x) returned from the hard sigmoid function to set the parameter x to +1 with probability p, or to -1 with probability 1 - p.
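To make that use concrete, here is a minimal NumPy sketch of stochastic binarization under the Courbariaux et al. definition (illustrative only; the actual BinaryConnect/BNN training code does considerably more, and stochastic_binarize is a name I chose here):
import numpy as np

def hard_sigmoid(x):
    # Courbariaux et al. 2016: clip((x + 1) / 2, 0, 1)
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

def stochastic_binarize(w, rng):
    # Binarize to +1 with probability p = sigma(w), otherwise to -1
    p = hard_sigmoid(w)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

rng = np.random.default_rng()
w = np.array([-1.5, -0.2, 0.0, 0.2, 1.5])
print(stochastic_binarize(w, rng))
# The first and last entries are saturated (p = 0 and p = 1), so they always map to -1 and +1;
# the middle entries are random.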
[1] Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio, "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", arXiv:1602.02830, 2016. https://arxiv.org/abs/1602.02830