How to prevent float imprecision from affecting numpy.arange?

Because numpy.arange() uses ceil((stop - start)/step) to determine the number of items, a small float imprecision (stop = .400000001) can add an unintended value to the list.

Example

The first case does not include the stop point (intended)

>>> print(np.arange(.1,.3,.1))
[0.1 0.2]

The second case includes the stop point (not intended)

>>> print(np.arange(.1,.4,.1))
[0.1 0.2 0.3 0.4]
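The arithmetic behind the two cases can be checked in plain Python. This is only a sketch: it assumes the length formula quoted above, ceil((stop - start)/step), and IEEE 754 double precision. `arange_length` is a hypothetical helper written for illustration, not a NumPy function.

```python
import math

def arange_length(start, stop, step):
    # Hypothetical helper: the length formula quoted in the question,
    # evaluated in ordinary double-precision floating point.
    return math.ceil((stop - start) / step)

# (0.3 - 0.1) / 0.1 lands at or just below 2, so ceil gives 2 elements.
print(arange_length(0.1, 0.3, 0.1))

# (0.4 - 0.1) / 0.1 lands just above 3, so ceil gives 4 elements
# and the stop point sneaks into the result.
print(arange_length(0.1, 0.4, 0.1))
```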

numpy.linspace() fixes this problem, e.g. np.linspace(.1,.4-.1,3), but it requires that you know the number of steps. Computing that number on the fly, as in np.linspace(start,stop-step,np.ceil((stop-step)/step)), leads to the same inconsistencies.

Question

How can I generate a reliable float range without knowing the number of elements in the range?

Extreme Case

Consider the case in which I want to generate a float index of unknown precision:

np.arange(2.00(...)001,2.00(...)021,.00(...)001)
Brendan Frick asked Feb 12 '18 21:02



1 Answer

Your goal is to calculate what ceil((stop - start)/step) would be if the values had been calculated with exact mathematics.

This is impossible to do given only floating-point values of start, stop, and step that are the results of operations in which some rounding errors may have occurred. Rounding removes information, and there is simply no way to create information from lack of information.

Therefore, this problem is only solvable if you have additional information about start, stop, and step.

Suppose step is exact, but start and stop have some accumulated errors bounded by e0 and e1. That is, you know start is at most e0 away from its ideal mathematical value (in either direction), and stop is at most e1 away from its ideal value (in either direction). Then, for a positive step, the ideal value of (stop-start)/step could lie anywhere from (stop-start-e0-e1)/step to (stop-start+e0+e1)/step.

Suppose there is an integer between (stop-start-e0-e1)/step and (stop-start+e0+e1)/step. Then it is impossible to know whether the ideal ceil result should be the lesser integer or the greater just from the floating-point values of start, stop, and step and the bounds e0 and e1.
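The straddling test described above can be sketched as follows. The function name `count_is_ambiguous` and the bound 1e-15 are assumptions chosen for illustration. The sketch also assumes step is exact and positive, and it ignores the small rounding error of the interval arithmetic itself.

```python
import math

def count_is_ambiguous(start, stop, step, e0, e1):
    # Hypothetical helper: bounds on the ideal (stop - start) / step,
    # given |error in start| <= e0 and |error in stop| <= e1.
    lo = (stop - start - e0 - e1) / step
    hi = (stop - start + e0 + e1) / step
    # An integer lies in [lo, hi] exactly when ceil(lo) <= floor(hi).
    # In that case the ideal ceil could be either of two integers.
    return math.ceil(lo) <= math.floor(hi)

# The ideal quotient is 3, so even a tiny error bound straddles it.
print(count_is_ambiguous(0.1, 0.4, 0.1, 1e-15, 1e-15))

# The ideal quotient is 3.5, far from any integer, so ceil is safe.
print(count_is_ambiguous(0.1, 0.45, 0.1, 1e-15, 1e-15))
```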

However, from the examples you have given, the ideal (stop-start)/step could be exactly an integer, as in (.4-.1)/.1. If so, any non-zero error bounds could result in the error interval straddling an integer, making the problem impossible to solve from the information we have so far.

Therefore, in order to solve the problem, you must have more information than just simple bounds on the errors. You must know, for example, that (stop-start)/step is exactly an integer or is otherwise quantized. For example, if you knew that the ideal calculation of the number of steps would produce a multiple of .1, such as 3.8, 3.9, 4.0, 4.1, or 4.2, but never 4.05, and the errors were sufficiently small that the floating-point calculation (stop-start)/step had a final error less than .05, then it would be possible to round (stop-start)/step to the nearest qualifying multiple and then to apply ceil to that.
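A minimal sketch of that round-then-ceil idea, assuming you know the ideal step count is a multiple of 1/parts_per_unit (10 for multiples of .1) and that the floating-point error is less than half of that quantum. The helper `quantized_count` and its `parts_per_unit` parameter are hypothetical names. The ceil is done in integer arithmetic so that the snapped value is not rounded a second time.

```python
def quantized_count(start, stop, step, parts_per_unit):
    # Hypothetical helper. Step count as computed in floating point.
    raw = (stop - start) / step
    # Snap to the nearest multiple of 1/parts_per_unit, kept as an
    # integer number of quanta (e.g. 30 quanta of 0.1 for 3.0).
    quanta = round(raw * parts_per_unit)
    # Integer ceiling of quanta / parts_per_unit, with no float involved.
    return -(-quanta // parts_per_unit)

print(quantized_count(0.1, 0.4, 0.1, 10))   # ideal count 3.0
print(quantized_count(0.1, 0.45, 0.1, 10))  # ideal count 3.5
```

With a count n obtained this way, the values themselves could then be built from integers, for example as start + step * np.arange(n), so that the length no longer depends on how the float division rounds.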

If you have such information, you can update the question with what you know about the errors in start, stop, and step (e.g., perhaps each of them is the result of a single conversion from decimal to floating-point) and the possible values of the ideal (stop-start)/step. If you do not have such information, there is no solution.

Eric Postpischil answered Nov 03 '22 13:11