 

Logsoftmax stability

I know how to make softmax stable by adding -max_i x_i to each element. This avoids overflow and underflow. The problem is that taking the log of the result can still fail: softmax(x) can underflow to zero, and log softmax(x) then evaluates to -infinity.
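For concreteness, here is a minimal NumPy repro of the problem (the values are chosen just for illustration):

import numpy as np

def softmax(x):
    z = x - np.max(x)           # standard max trick: largest exponent is 0
    e = np.exp(z)
    return e / e.sum()

x = np.array([0.0, 1000.0])
p = softmax(x)                  # p[0] underflows to exactly 0.0
print(np.log(p))                # -> [-inf   0.]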

I am not sure how to fix this. I know it is a common problem, and I have read several answers about it, but I am still confused about how to solve it.

PS: If you could provide a simple example, that would be awesome.

asked May 20 '17 by Abhishek Bhatia

2 Answers

In order to stabilize log-softmax, most implementations, such as TensorFlow and Theano, use the same trick that is commonly used to compute softmax stably: subtract out the largest component, max(x_i). For log-softmax, we begin with:

$$\log \operatorname{softmax}(x_i) = \log \frac{e^{x_i}}{\sum_j e^{x_j}} = x_i - \log \sum_j e^{x_j - b}\, e^{b}$$

After pulling the exp(b) factor out of the sum and using the fact that log(exp(b)) = b, we have:

$$\log \operatorname{softmax}(x_i) = x_i - b - \log \sum_j e^{x_j - b}$$

If we set b = max(x_i), this new equation is stable against both overflow and underflow: every exponent x_j - b is at most 0, so exp cannot overflow, and the sum contains the term exp(0) = 1, so it can never be zero and the log can never return -infinity.


In terms of code, if x is a vector:

import numpy as np

def log_softmax(x):
    # shift by the max so the largest exponent is 0 (no overflow)
    x_off = x - np.max(x)
    # the sum is >= 1 because it contains exp(0) = 1, so the log is finite
    return x_off - np.log(np.sum(np.exp(x_off)))
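A quick sanity check (assuming the log_softmax above is in scope; values are illustrative) that this stays finite where the naive log(softmax(x)) does not:

x = np.array([0.0, 1000.0])
naive = np.log(np.exp(x) / np.sum(np.exp(x)))   # exp(1000) overflows -> [-inf  nan]
print(naive)
print(log_softmax(x))                           # -> [-1000.     0.]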

See also: https://timvieira.github.io/blog/post/2014/02/11/exp-normalize-trick/

answered Sep 25 '22 by e3oroush


TensorFlow's tf.nn.log_softmax implements this for you, computing

logsoftmax = logits - log(reduce_sum(exp(logits), dim))

in a numerically stable way.

refer: https://www.tensorflow.org/api_docs/python/tf/nn/log_softmax
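For example (a small sketch; the 1000.0 logit is just an illustrative value that would overflow a naive exp):

import tensorflow as tf

logits = tf.constant([[0.0, 1000.0]])
out = tf.nn.log_softmax(logits, axis=-1)
print(out)   # -> [[-1000.     0.]], finite even though exp(1000.) overflows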

answered Sep 23 '22 by Nemo