
Why does numpy integer subtraction produce a float64?

In numpy, why does subtraction of integers sometimes produce floating point numbers?

>>> x = np.int64(2) - np.uint64(1)
>>> x
1.0
>>> x.dtype
dtype('float64')

This seems to only occur when using multiple different integer types (e.g. signed and unsigned), and when no larger integer type is available.

benjimin asked Jun 30 '17 05:06




2 Answers

This is a conscious design decision by the NumPy authors. When deciding on the result type, only the types of the operands are considered, not their actual values. For the operation you perform, the result could fall outside the valid range of either operand type: for example, if you subtract a very large uint64 number, the result would not fit in an int64. The safe choice is therefore to convert to float64, whose range covers every possible result (though possibly with reduced precision).
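To make the "reduced precision" caveat concrete, here is a minimal sketch. The value 2**53 + 1 is my own choice for illustration, picked because it is the smallest positive integer that float64 cannot represent exactly; it is not from the original question.

```python
import numpy as np

# 2**53 + 1 is the smallest positive integer float64 cannot represent exactly.
big = np.int64(2**53 + 1)

# Mixing int64 with uint64 promotes both operands to float64 before subtracting.
diff = big - np.uint64(0)

print(diff.dtype)              # float64
print(int(diff) == 2**53 + 1)  # False: the trailing +1 was rounded away
```

So the result always fits in terms of range, but integers above 2**53 in magnitude may silently lose their low-order bits.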

Compare with x = np.int32(2) - np.uint32(1). The result of that subtraction can always be safely represented as an int64, so that type is chosen. The same is true for x = np.int64(2) - np.uint32(1), which also yields an int64.
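The three cases can be checked side by side. This is just a quick sketch using the same scalar constructors as the question; the variable names a, b and c are mine.

```python
import numpy as np

a = np.int32(2) - np.uint32(1)  # int64 can hold every int32 and uint32 value
b = np.int64(2) - np.uint32(1)  # int64 can hold every uint32 value
c = np.int64(2) - np.uint64(1)  # no integer type covers both ranges

print(a.dtype, b.dtype, c.dtype)  # int64 int64 float64
```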

The alternative would be to follow e.g. the C rules, which would convert both operands to uint64. But that could, of course, lead to very strange results when the unsigned arithmetic wraps around.
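As a rough illustration of what that would look like, the sketch below forces both operands to uint64 by hand before subtracting. This only emulates the C-style conversion with an explicit astype; it is not something NumPy does on its own for mixed int64/uint64 operands. Arrays are used here rather than scalars purely for the demonstration.

```python
import numpy as np

# Emulated "C-like" behaviour: force both operands to uint64 first.
small = np.array([1], dtype=np.int64).astype(np.uint64)
large = np.array([2], dtype=np.uint64)

wrapped = small - large  # unsigned arithmetic wraps modulo 2**64
print(wrapped)           # [18446744073709551615]
print(wrapped.dtype)     # uint64
```

Instead of -1, the result is 2**64 - 1, which is the kind of surprise the float64 promotion avoids.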

If you want to know ahead of time what type you will end up with, look into np.result_type(), np.can_cast(), and np.promote_types(). Reading about this in the docs might also help you understand the issue a bit better.
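For instance, a short sketch of how those three functions answer the question without performing any arithmetic (the particular type pairs are chosen to mirror the examples above):

```python
import numpy as np

# Result dtype NumPy would pick for an operation on these two types.
print(np.result_type(np.int64, np.uint64))    # float64
print(np.promote_types(np.int32, np.uint32))  # int64

# Can every value of the first type be safely stored in the second?
print(np.can_cast(np.uint64, np.int64))       # False
print(np.can_cast(np.uint32, np.int64))       # True
```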

JohanL answered Sep 21 '22 17:09


I'm no expert on NumPy, but I suspect that since float64 is the smallest data type whose range covers both int64 and uint64, the subtraction converts both operands to float64 so that the operation always succeeds.

For example, consider the analogous case with int8 and uint8: 0 - 255 = -255 cannot fit in an int8, since int8 only goes down to -128. Similarly, we can't use a uint8, since we obviously need the sign in this case. For 8-bit operands NumPy can simply move up to int16, but for int64 and uint64 there is no wider integer type, hence we settle on a float/double, as its range covers both directions.
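A quick check of the 8-bit analogy, as a sketch with values I picked to hit the extremes of each range. It shows that with int8 and uint8 a wider integer type is still available, so no float is needed there; the float64 fallback only appears at 64 bits.

```python
import numpy as np

lo = np.array([-128], dtype=np.int8)  # smallest int8 value
hi = np.array([255], dtype=np.uint8)  # largest uint8 value

diff = lo - hi
print(diff.dtype, diff)  # int16 [-383]

print(np.promote_types(np.int64, np.uint64))  # float64
```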

mattjegan answered Sep 24 '22 17:09