Why is the dtype shown (even if it's the native one) when using floor division with NumPy?

Tags: python, arrays, division, numpy, numpy-dtype

Normally the dtype is hidden when it's equivalent to the native type:

>>> import numpy as np
>>> np.arange(5)
array([0, 1, 2, 3, 4])
>>> np.arange(5).dtype
dtype('int32')

>>> np.arange(5) + 3
array([3, 4, 5, 6, 7])

But somehow that doesn't apply to floor division or modulo:

>>> np.arange(5) // 3
array([0, 0, 0, 1, 1], dtype=int32)
>>> np.arange(5) % 3
array([0, 1, 2, 0, 1], dtype=int32)

Why is there a difference?

Python 3.5.4, NumPy 1.13.1, Windows 64bit

asked Sep 18 '17 17:09

MSeifert

2 Answers

It comes down to a difference in the dtype, as can be seen from the view:

In [186]: x = np.arange(10)
In [187]: y = x // 3
In [188]: x
Out[188]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [189]: y
Out[189]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3], dtype=int32)
In [190]: x.view(y.dtype)
Out[190]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
In [191]: y.view(x.dtype)
Out[191]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])

Even though the two dtypes have the same descr, some attribute must differ. But which?

In [192]: x.dtype.descr
Out[192]: [('', '<i4')]
In [193]: y.dtype.descr
Out[193]: [('', '<i4')]

In [204]: x.dtype.type
Out[204]: numpy.int32
In [205]: y.dtype.type
Out[205]: numpy.int32
In [207]: x.dtype.type is y.dtype.type
Out[207]: False

In [243]: np.core.numeric._typelessdata
Out[243]: [numpy.int32, numpy.float64, numpy.complex128]
In [245]: x.dtype.type in np.core.numeric._typelessdata
Out[245]: True
In [246]: y.dtype.type in np.core.numeric._typelessdata
Out[246]: False

So y's dtype.type looks identical to x's, but it is a different object with a different id:

In [261]: id(np.int32)
Out[261]: 3045777728
In [262]: id(x.dtype.type)
Out[262]: 3045777728
In [263]: id(y.dtype.type)
Out[263]: 3045777952
In [282]: id(np.intc)
Out[282]: 3045777952

Add this extra type to the list, and y no longer shows the dtype:

In [267]: np.core.numeric._typelessdata.append(y.dtype.type)
In [269]: y
Out[269]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])

So on this machine y.dtype.type is np.intc (which is also np.intp), while x.dtype.type is np.int32 (which is also np.int_).
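The question "which attribute differs?" can also be probed without comparing ids. The sketch below is only an illustration: the helper names `equivalent_scalar_types` and `describe` are made up here, and it assumes (as the other answer suggests) that the two look-alike types correspond to different C integer types, in which case `dtype.char` should tell them apart. What it prints depends on the platform and NumPy version, so no output is shown.

```python
import numpy as np

# Scalar types NumPy exposes for the C signed integer types.
C_INT_TYPES = [np.byte, np.short, np.intc, np.int_, np.longlong]


def equivalent_scalar_types(dtype):
    """Return the distinct scalar types whose dtype compares equal to `dtype`."""
    target = np.dtype(dtype)
    found = []
    for t in C_INT_TYPES:
        # Compare by identity so aliases of one type are listed only once.
        if np.dtype(t) == target and not any(t is s for s in found):
            found.append(t)
    return found


def describe(arr):
    """Show the scalar type's name together with the dtype's C type code."""
    return "%s (char=%r, id=%d)" % (
        arr.dtype.type.__name__, arr.dtype.char, id(arr.dtype.type))


x = np.arange(5)
y = x // 3
print(describe(x))
print(describe(y))
print(equivalent_scalar_types(x.dtype))
```

If the last line lists more than one type, the platform has several interchangeable integer types of that size, which is the situation in which the repr can differ.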

So to make an array that displays the dtype, create it with dtype=np.intc:

In [23]: np.arange(10,dtype=np.int_)
Out[23]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [24]: np.arange(10,dtype=np.intc)
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)

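And to turn this off, append np.intc to np.core.numeric._typelessdata. Note that this is a private list, as the leading underscore indicates, so treat it as a workaround rather than a supported setting.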
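A less invasive option, if the goal is only a tidy repr, is to reinterpret the result with the default integer dtype, as `y.view(x.dtype)` did in the session above. This is a minimal sketch: `with_default_int` is a hypothetical helper name, and it assumes the default integer dtype is whatever `np.arange` produces on your platform.

```python
import numpy as np


def with_default_int(arr):
    """Return a view of `arr` that uses the default integer dtype object.

    Only arrays whose dtype already compares equal to the default integer
    dtype are touched, so no data is converted or copied.
    """
    default = np.arange(1).dtype
    if arr.dtype == default:
        return arr.view(default)
    return arr


y = np.arange(10) // 3
z = with_default_int(y)
print(repr(z))
```

Because this is a view, `z` shares memory with `y`. Arrays of any other dtype are returned unchanged.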

answered Oct 12 '22 14:10

hpaulj

You actually have multiple distinct 32-bit integer dtypes here. This is probably a bug.

NumPy has (accidentally?) created multiple distinct signed 32-bit integer types, probably corresponding to C int and long. Both of them display as numpy.int32, but they're actually different objects. At C level, I believe the type objects are PyIntArrType_Type and PyLongArrType_Type, generated here.

dtype objects have a type attribute corresponding to the type object of scalars of that dtype. It is this type attribute that NumPy inspects when deciding whether to print dtype information in an array's repr:

_typelessdata = [int_, float_, complex_]
if issubclass(intc, int):
    _typelessdata.append(intc)


if issubclass(longlong, int):
    _typelessdata.append(longlong)

...

def array_repr(arr, max_line_width=None, precision=None, suppress_small=None):
    ...
    skipdtype = (arr.dtype.type in _typelessdata) and arr.size > 0

    if skipdtype:
        return "%s(%s)" % (class_name, lst)
    else:
        ...
        return "%s(%s,%sdtype=%s)" % (class_name, lst, lf, typename)

On numpy.arange(5) and numpy.arange(5) + 3, .dtype.type is numpy.int_; on numpy.arange(5) // 3 or numpy.arange(5) % 3, .dtype.type is the other 32-bit signed integer type.
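The mechanism can be reproduced without NumPy. The toy below is not NumPy's code: `IntLong`, `IntC`, `typeless` and `toy_array_repr` are hypothetical stand-ins that only mimic the membership test shown above, using two distinct classes that share one display name.

```python
# Two distinct classes with the same __name__, standing in for the two
# look-alike 32-bit integer scalar types.
IntLong = type("int32", (int,), {})  # plays the role of np.int_
IntC = type("int32", (int,), {})     # plays the role of np.intc

# Mirrors _typelessdata when it contains only one of the two types.
typeless = [IntLong]


def toy_array_repr(values, scalar_type):
    """Mimic the skipdtype decision from array_repr on a plain list."""
    body = ", ".join(str(v) for v in values)
    # Classes compare by identity, so an equal name is not enough here.
    skipdtype = scalar_type in typeless and len(values) > 0
    if skipdtype:
        return "array([%s])" % body
    return "array([%s], dtype=%s)" % (body, scalar_type.__name__)


print(toy_array_repr([3, 4, 5, 6, 7], IntLong))  # array([3, 4, 5, 6, 7])
print(toy_array_repr([0, 0, 0, 1, 1], IntC))     # array([0, 0, 0, 1, 1], dtype=int32)
```

Both classes print as int32, yet only the one that is literally in the list suppresses the dtype suffix.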

As for why + and // have different output dtypes, they use different type resolution routines. Here's the one for //, and here's the one for +. //'s type resolution looks for a ufunc inner loop that takes types the inputs can be safely cast to, while +'s type resolution applies NumPy type promotion to the arguments and picks the loop matching the resulting type.

answered Oct 12 '22 13:10

user2357112 supports Monica
