Question motivation:
In the standard numerical languages I am aware of (e.g. Matlab, Python numpy, etc.), if you take the exponential of a modestly large number, the output is infinity as the result of numerical overflow. If this is multiplied by 0, you get NaN. Taken separately, each of these steps is reasonable enough, but together they amount to a logical mistake in the implementation of the math: the number produced by the overflow is known to be finite, and we clearly want multiplying this large finite number by 0 to give 0.
Explicitly:
>>> import numpy as np
>>> np.exp(709)*0
0.0
>>> np.exp(710)*0
nan
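(For reference, the threshold sits just below 710 because the largest finite double is np.finfo(np.float64).max, whose natural log is about 709.78:)

>>> np.log(np.finfo(np.float64).max)
709.782712893384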
I imagine we could here introduce a notion of the "largest finite value" (LFV), which would have the following properties:

- Any operation that overflows (e.g. np.exp(710)) returns LFV rather than infinity, since the true result is known to be finite.
- LFV * 0 = 0, like any other finite number.
- LFV < infinity, so LFV is still strictly smaller than a proper infinity.
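To make this concrete, here is a toy sketch of how such a value might behave. The LFV class below is purely hypothetical (it is not part of numpy or any library I know of, and sign handling is ignored for brevity):

import numpy as np

class LFV:
    # Hypothetical "largest finite value": known to be finite, but too
    # large to represent explicitly.
    def __mul__(self, other):
        if other == 0:
            return 0.0   # the key property: a finite value times 0 is 0
        return self      # otherwise still some unrepresentably large value
    __rmul__ = __mul__
    def __lt__(self, other):
        return other == np.inf   # LFV < inf, but not < any finite float
    def __repr__(self):
        return 'LFV'

lfv = LFV()
print(lfv * 0)       # 0.0
print(lfv < np.inf)  # True
print(lfv < 1e308)   # False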
On the other hand, infinity should not simply be redefined in the way described for LFV. It does not make sense to have 0 * infinity = 0; appropriately, the current standard implementation of infinity yields NaN in this setting. Also, there sometimes arises a need to initialize numbers to infinity, and you'd want the result of any numerical operation, even one that yields LFV, to be strictly less than the initialized value (this is convenient for some logical statements). I'm sure other situations exist where a proper infinity is necessary; my point is simply that infinity should not be redefined to have some of the LFV properties above.
The Question:
I want to know if there is any language that uses a scheme like this, and whether there are any problems with such a scheme. This problem does not arise in pure math, since there are no limits on the size of numbers there, but I think it is a real problem when implementing a consistent mathematics in programming languages. Essentially, I think what I want from LFV is a shorthand for the open interval between the largest explicit value and infinity, LFV = (LEV, infinity), but maybe this intuition is wrong.
Update: In the comments, people seem to be objecting a little to the utility of the issue I'm bringing up. My question arises not because many related issues occur, but because the same issue frequently arises in many different settings. From talking to people who do data analysis, this is something that contributes often enough to runtime errors when training/fitting models. The question is basically why this isn't handled by numerical languages. From the comments, I gather that the people who write the languages don't see the utility of handling things this way. In my opinion, when a specific issue occurs frequently enough for the users of a language, it might make sense to handle it in a principled way so each user doesn't have to.
Googolplexian+1 is the largest finite number.
Infinity is not a number, but if it were, it would be the largest number. Of course, such a largest number does not exist in a strict sense: if some number n were the largest number, then n + 1 would be even larger, leading to a contradiction. Hence infinity is a concept rather than a number.
There is a crucial difference between the size assigned to a given number and the size of a set of numbers. The numbers 1 and 2 have finite values. But whether there exist zero or infinitely many other numbers between them is a totally different question. What happens when we add any number to infinity?
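For IEEE floats, at least, the answer is that it stays infinity; adding even the largest representable double doesn't change it:

>>> import numpy as np
>>> np.inf + 1e308
inf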
So... I got curious and dug around a little.
As I already mentioned in the comments, a "largest finite value" kind of exists in IEEE 754, if you consider the exception status flags. A value of infinity with the overflow flag set corresponds to your proposed LFV, with the difference that the flag only becomes available after the operation, instead of being stored as part of the value itself. This means you have to check the flag and act on overflow manually, instead of just having LFV * 0 = 0 built in.
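For illustration, numpy does expose this machinery from Python: with np.errstate(over='call') plus np.seterrcall you can register a callback that fires on overflow, which is roughly the "check the flag and act" workflow (the record_overflow helper below is just for demonstration):

import numpy as np

overflowed = False

def record_overflow(err_type, flag):
    # numpy invokes this when an operation overflows under over='call'
    global overflowed
    overflowed = True

old_handler = np.seterrcall(record_overflow)
with np.errstate(over='call'):
    x = np.exp(710)       # overflows: x becomes inf and the callback fires
np.seterrcall(old_handler)

print(x, overflowed)      # inf True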
There is quite an interesting paper on exception handling and its support in programming languages. Quote:
The IEEE 754 model of setting a flag and returning an infinity or a quiet NaN assumes that the user tests the status frequently (or at least appropriately.) Diagnosing what the original problem was requires the user to check all results for exceptional values, which in turn assumes that they are percolated through all operations, so that erroneous data can be flagged. Given these assumptions, everything should work, but unfortunately they are not very realistic.
The paper also bemoans the poor support for floating-point exception handling, especially in C99 and Java (I'm sure most other languages aren't better, though). The fact that, in spite of this, there are no major efforts to fix it or create a better standard seems to me to indicate that IEEE 754 and its support are, in a sense, "good enough" (more on that later).
Let me give a solution to your example problem to demonstrate something. I'm using numpy's seterr to make it raise an exception on overflow:
import numpy as np

def exp_then_mult_naive(a, b):
    # Ignore all floating point errors; overflow silently becomes inf.
    err = np.seterr(all='ignore')
    x = np.exp(a) * b
    np.seterr(**err)
    return x

def exp_then_mult_check_zero(a, b):
    # Raise on overflow so that b == 0 can be special-cased.
    err = np.seterr(all='ignore', over='raise')
    try:
        x = np.exp(a)
        return x * b
    except FloatingPointError:
        if b == 0:
            return 0
        else:
            return exp_then_mult_naive(a, b)
    finally:
        np.seterr(**err)

def exp_then_mult_scaling(a, b):
    # Shift scale from b into the exponent: exp(a) * b == exp(a-1) * (b*e).
    err = np.seterr(all='ignore', over='raise')
    e = np.exp(1)
    while abs(b) < 1:
        try:
            x = np.exp(a) * b
            break
        except FloatingPointError:
            a -= 1
            b *= e
    else:
        # abs(b) reached 1 without exp(a) fitting; fall back to naive.
        x = exp_then_mult_naive(a, b)
    np.seterr(**err)
    return x
large = np.float64(710)
tiny = np.float64(0.01)
zero = np.float64(0.0)
print('naive: e**710 * 0 = {}'.format(exp_then_mult_naive(large, zero)))
print('check zero: e**710 * 0 = {}'
      .format(exp_then_mult_check_zero(large, zero)))
print('check zero: e**710 * 0.01 = {}'
      .format(exp_then_mult_check_zero(large, tiny)))
print('scaling: e**710 * 0.01 = {}'.format(exp_then_mult_scaling(large, tiny)))
# output:
# naive: e**710 * 0 = nan
# check zero: e**710 * 0 = 0
# check zero: e**710 * 0.01 = inf
# scaling: e**710 * 0.01 = 2.233994766161711e+306
exp_then_mult_naive does what you did: an expression that overflows, multiplied by 0, and you get a nan.

exp_then_mult_check_zero catches the overflow and returns 0 if the second argument is 0; otherwise it behaves the same as the naive version (note that inf * 0 == nan, while inf * positive_value == inf). This is the best you could do if there were an LFV constant.

exp_then_mult_scaling uses information about the problem to get results for inputs the other two can't deal with: if b is small, we can multiply it by e while decrementing a without changing the result. So if np.exp(a) < np.inf before abs(b) >= 1, the result fits. (I know I could check whether it fits in one step instead of using the loop, but this was easier to write.)

So now you have a situation where a solution that doesn't require an LFV provides correct results for more input pairs than one that does. The only advantage of an LFV here is that it uses fewer lines of code while still giving a correct result in that one particular case.
By the way, I'm not sure about thread safety with seterr. So if you're using it in multiple threads with different settings in each thread, test it out beforehand to avoid headaches later.
Bonus factoid: the original standard actually stipulated that you should be able to register a trap handler that would, on overflow, be given the result of the operation divided by a large number (see section 7.3). That would allow you to carry on the computation, as long as you keep in mind that the value is actually much larger. Although I guess it could become a minefield of WTF in a multithreaded environment, never mind that I didn't really find support for it.
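To illustrate the idea (my own sketch, not an actual trap handler; 1536 is, if I recall the standard correctly, the exponent adjustment it specified for double precision):

import numpy as np

SCALE = 1536   # wrapped-exponent adjustment for binary64 in IEEE 754-1985
# Carry the 2**-1536 scaling manually, the way a trap handler would:
scaled = np.exp(710 - SCALE * np.log(2)) * 0.01   # = exp(710)*0.01 / 2**1536
result = np.ldexp(scaled, SCALE)                  # re-apply the 2**1536 scale
print(result)   # ~2.234e+306, with no overflow along the way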
To get back to the "good enough" point from above: in my understanding, IEEE 754 was designed as a general-purpose format, usable for practically any application. When you say "the same issue frequently arises in many different settings", it is (or at least was) apparently not frequent enough to justify inflating the standard.
Let me quote from the Wikipedia article:
[...] the more esoteric features of the IEEE 754 standard discussed here, such as extended formats, NaN, infinities, subnormals etc. [...] are designed to give safe robust defaults for numerically unsophisticated programmers, in addition to supporting sophisticated numerical libraries by experts.
Putting aside that, in my opinion, even having NaN as a special value is a bit of a dubious decision, adding an LFV isn't really going to make it easier or safer for the "numerically unsophisticated", and doesn't allow experts to do anything they couldn't already.
I guess the bottom line is that representing rational numbers is hard. IEEE 754 does a pretty good job of making it simple for a lot of applications. If yours isn't one of them, in the end you'll just have to deal with the hard stuff yourself.