Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Avoid underflow using exp and minimum positive float128 in numpy

I am trying to calculate the following ratio: w(i) / (sum(w(j)) where w are updated using an exponential decreasing function, i.e. w(i) = w(i) * exp(-k), k being a positive parameter. All the numbers are non-negative. This ratio is then used to a formula (multiply with a constant and add another constant). As expected, I soon run into underflow problems.

I guess this happens often but can someone give me some references on how to deal with this? I did not find an appropriate transformation so one thing I tried to do is set some minimum positive number as a safety threshold but I did not manage to find which is the minimum positive float (I am representing numbers in numpy.float128). How can I actually get the minimum positive such number on my machine? The code looks like this:

w = np.ones(n, dtype='float128')
lt = np.ones(n)
for t in range(T):
    p = (1-k) * w / w.sum() + (k/n)
    # Process a subset of the n elements, call it set I, j is some range()
    for i in I: 
        s = p[list(j[i])].sum()
        lt /= s
        w[s] *= np.exp(-k * lt)

where k is some constant in (0,1) and n is the length of the array

like image 862
user90772 Avatar asked Oct 19 '22 23:10

user90772


1 Answers

When working with exponentially small numbers it's usually better to work in log space. For example, log(w*exp(-k)) = log(w) - k, which won't have any over/underflow problems unless k is itself exponentially large or w is zero. And, if w is zero, numpy will correctly return -inf. Then, when doing the sum, you factor out the largest term:

log_w = np.log(w) - k
max_log_w = np.max(log_w)
# Individual terms in the following may underflow, but then they wouldn't
# contribute to the sum anyways.
log_sum_w = max_log_w + np.log(np.sum(np.exp(log_w - max_log_w)))
log_ratio = log_w - log_sum_w

This probably isn't exactly what you want since you could just factor out the k completely (assuming it's a constant and not an array), but it should get you on your way.

Scikit-learn implements a similar thing with extmath.logsumexp, but it's basically the same as the above.

like image 106
clwainwright Avatar answered Oct 21 '22 23:10

clwainwright