Why does numpy integer subtraction produce a float64?

Tags: python, type-conversion, numpy

In numpy, why does subtraction of integers sometimes produce floating point numbers?

>>> import numpy as np
>>> x = np.int64(2) - np.uint64(1)
>>> x
1.0
>>> x.dtype
dtype('float64')

This seems to only occur when using multiple different integer types (e.g. signed and unsigned), and when no larger integer type is available.

asked Jun 30 '17 05:06

benjimin

2 Answers

This is a conscious design decision by the numpy authors. When deciding on the result type, numpy considers only the types of the operands, not their actual values. For the operation you perform, the result could fall outside the valid range of either integer type: if you subtract a very large uint64 number, the result would not fit in an int64, and a negative result would not fit in a uint64. The safe choice is thus to convert to float64, whose range covers every possible result (possibly with reduced precision, though, since float64 represents integers exactly only up to 2**53).
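To make the "reduced precision" caveat concrete, here is a minimal sketch, assuming a reasonably recent NumPy. The variable names are only illustrative. It converts a uint64 value that needs all 64 bits to float64 and back, and shows that the low bit is lost along the way.

```python
import numpy as np

# A uint64 value that is too large for int64 and needs all 64 bits.
big = np.uint64(2**63 + 1)

# Mixed int64/uint64 subtraction is promoted to float64 based on dtypes alone.
diff = np.int64(2) - big
print(diff.dtype)

# float64 has a 53-bit significand, so the value is rounded on conversion.
as_float = np.float64(big)
round_trip = int(as_float)

print(int(big))
print(round_trip)
print(round_trip == int(big))
```

The last line should print False: the range fits, but the exact integer does not survive the conversion.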

Compare this with x = np.int32(2) - np.uint32(1). Every int32 and every uint32 value can be safely represented as an int64, therefore that type is chosen. The same is true for x = np.int64(2) - np.uint32(1), which also yields an int64.
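The two cases can be checked directly. This is just a quick sketch with made-up variable names:

```python
import numpy as np

# int32 and uint32 both fit in int64, so the result stays an integer.
x32 = np.int32(2) - np.uint32(1)
print(x32, x32.dtype)

# uint32 also fits in int64, so int64 - uint32 stays int64.
x64 = np.int64(2) - np.uint32(1)
print(x64, x64.dtype)
```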

The alternative would be to follow e.g. the C rules, which would cast everything to uint64. But that could, of course, lead to very strange results when a negative result wraps around to a huge positive number.
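As a rough illustration of what the C-style alternative would look like, the sketch below forces both operands to uint64 by hand and compares that with NumPy's own choice. Arrays are used instead of scalars only to keep the unsigned wraparound silent; the names are hypothetical.

```python
import numpy as np

a = np.array([2], dtype=np.int64)
b = np.array([3], dtype=np.uint64)

# Emulate the C conversion: treat both operands as unsigned 64-bit.
c_style = a.astype(np.uint64) - b
print(c_style, c_style.dtype)

# NumPy's own promotion goes to float64 and keeps the sign.
numpy_style = a - b
print(numpy_style, numpy_style.dtype)
```

With the C-style cast, 2 - 3 wraps around to 2**64 - 1, whereas the float64 result is -1.0.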

If you want to know ahead of time what type you will end up with, look into np.result_type(), np.can_cast(), and np.promote_types(). Reading about this in the docs might also help you understand the issue a bit better.
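A short sketch of how those three functions can be queried with dtypes only, no values involved (variable names are illustrative):

```python
import numpy as np

# Result dtype of combining int64 and uint64 operands.
mixed_64 = np.result_type(np.int64, np.uint64)
print(mixed_64)

# Smallest dtype to which both int32 and uint32 can be safely cast.
mixed_32 = np.promote_types(np.int32, np.uint32)
print(mixed_32)

# Safe casting checks: uint64 does not fit in int64, uint32 does.
u64_to_i64 = np.can_cast(np.uint64, np.int64)
u32_to_i64 = np.can_cast(np.uint32, np.int64)
print(u64_to_i64, u32_to_i64)
```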

answered Sep 21 '22 17:09

JohanL

I'm no expert on numpy, but I suspect the reason is that no integer type can hold both the domain of int64 and that of uint64. float64 is the smallest data type whose range covers both, so the subtraction converts both operands into a float64 so that the operation always succeeds.

For example, with int8 and uint8: -128 - 255 = -383 cannot fit in an int8, since int8 only goes down to -128. Similarly, we can't use a uint8 since we obviously need the sign in this case. For these small types a wider signed integer (int16) can hold both ranges, but for int64 and uint64 no wider integer type is available. Hence, we settle on a float/double as it can fit both directions fine.
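For the 8-bit case a wider integer type does exist, so NumPy should pick int16 rather than a float; float64 only appears once the integer types run out. A quick check, assuming a reasonably recent NumPy and using illustrative names:

```python
import numpy as np

# Extreme values of each 8-bit type.
lowest_int8 = np.int8(-128)
highest_uint8 = np.uint8(255)

# Both operands are promoted to int16 before subtracting.
diff8 = lowest_int8 - highest_uint8
print(diff8, diff8.dtype)

# With 64-bit operands there is no wider integer, so float64 is used.
diff64 = np.int64(-128) - np.uint64(255)
print(diff64, diff64.dtype)
```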

answered Sep 24 '22 17:09

mattjegan
