
"Divide by zero encountered in log" when not dividing by zero

Tags: python, numpy

When I do:

summing += yval * np.log(sigmoid(np.dot(w.transpose(), xi.transpose()))) + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))

where there is no division, why do I get a "divide by zero encountered in log" error? As a result, summing becomes [nan].

asked Mar 25 '16 22:03 by Jobs


3 Answers

That's the warning NumPy emits when you evaluate log at 0:

>>> import numpy as np
>>> np.log(0)
__main__:1: RuntimeWarning: divide by zero encountered in log
-inf

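I agree the message is not very clear. No division takes place: NumPy reports log(0), which evaluates to -inf, under its "divide by zero" floating-point error category.

So in your case, I would check why the input to log is 0.

PS: this is on NumPy 1.10.4.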
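As a minimal sketch of how the [nan] in the question can arise (the variable names here are illustrative and not taken from the asker's code): log(0) evaluates to -inf, and multiplying -inf by a zero coefficient such as (1-yval) gives nan. The second half shows one way to locate the offending call, by using np.errstate to turn the warning into an exception.

```python
import numpy as np

# Silence the warnings locally so the resulting values can be inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    log_zero = np.log(0.0)      # -inf; normally warns "divide by zero encountered in log"
    product = 0.0 * log_zero    # 0 * -inf is nan; normally warns about an invalid value

# Promote the warning to an exception to find out which call receives a zero.
caught = False
try:
    with np.errstate(divide="raise"):
        np.log(np.array([0.5, 0.0]))
except FloatingPointError:
    caught = True
```

With divide="raise" in effect, the traceback points at the exact np.log call whose argument contains a zero.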

answered Oct 24 '22 13:10 by DevShark


I had the same problem. It looks like you're doing logistic regression. I was doing multi-class classification with logistic regression, and in that case you need to use the one-vs-all approach (search for details).

If yval holds class labels such as yval = [1,2,3,4,...] instead of only 1 and 0, you get negative costs, which lead to a runaway theta, which in turn pushes the argument of log(y) toward zero.

The fix should be to preprocess yval so that it contains only 1 for positive examples and 0 for negative examples.
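A minimal sketch of that preprocessing step, assuming the labels are stored in a NumPy array. The array y_multi and the helper one_vs_all_labels are hypothetical names for illustration, since the asker's real yval is not shown.

```python
import numpy as np

# Hypothetical multi-class labels standing in for the asker's yval.
y_multi = np.array([1, 2, 3, 1, 2])

def one_vs_all_labels(y, positive_class):
    """Return 1.0 where y equals positive_class and 0.0 everywhere else."""
    return (y == positive_class).astype(float)

# One binary target vector per class; each would train its own classifier.
binary_targets = {int(k): one_vs_all_labels(y_multi, k) for k in np.unique(y_multi)}
```

Each entry of binary_targets contains only 0.0 and 1.0, so it can be used as yval in the cost expression from the question.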

answered Oct 24 '22 13:10 by Seth


Even though it's late, this answer might help someone else.

In this part of your code:

... + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))

np.dot(w.transpose(), xi.transpose()) may be producing large values (above 40 or so), so that sigmoid() returns exactly 1. You are then taking np.log of 1 - 1, which is 0. As DevShark mentioned above, that triggers the RuntimeWarning: divide by zero encountered in log.

How did I come up with the number 40, you might ask? For inputs above 40 or so, a sigmoid computed with NumPy returns exactly 1. Assuming the usual form 1 / (1 + exp(-x)) in float64, exp(-x) is then too small to change the sum 1 + exp(-x); the actual cutoff is near 37.
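The saturation can be checked directly. The sketch below assumes the asker's sigmoid() is the plain logistic function, which the question does not show. It also includes one possible workaround that is not part of the original answer: computing the two log terms with np.logaddexp so that sigmoid(z) is never formed. The helper names are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    # Plain logistic function, assumed to match the asker's sigmoid().
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([10.0, 30.0, 40.0])
saturated = sigmoid(z) == 1.0   # True where the result rounds to exactly 1.0

def log_sigmoid(z):
    # log(sigmoid(z)) = -log(1 + exp(-z)), computed without forming sigmoid(z)
    return -np.logaddexp(0.0, -z)

def log_one_minus_sigmoid(z):
    # log(1 - sigmoid(z)) = -log(1 + exp(z))
    return -np.logaddexp(0.0, z)

stable = log_one_minus_sigmoid(z)   # stays finite; approaches -z as z grows
```

Only the last entry of saturated is True, and stable contains no -inf even at z = 40, so the cost term stays finite there.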

Looking at your implementation, it seems you're implementing logistic regression, in which case (as far as I understand) feature scaling is very important: unscaled features can make the dot product large enough to saturate the sigmoid.
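One common form of feature scaling is standardization, sketched below. The feature matrix X is hypothetical (one row per example, one column per feature), and this may not be the scaling method the answer had in mind.

```python
import numpy as np

# Hypothetical feature matrix: one row per example, one column per feature.
X = np.array([[1000.0, 0.5],
              [2000.0, 0.1],
              [3000.0, 0.9]])

mean = X.mean(axis=0)
std = X.std(axis=0)
std[std == 0.0] = 1.0          # leave constant columns unscaled to avoid dividing by zero
X_scaled = (X - mean) / std    # non-constant columns now have mean 0 and std 1
```

The same mean and std computed on the training data would need to be reused for any data the model sees later.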

Since I'm writing an answer for the first time, it is possible I have violated some rules; if so, I apologise.

answered Oct 24 '22 13:10 by Mayur