This isn't so much a problem as a curiosity.
In my interpreter on 64-bit Linux I can execute:
In [10]: np.int64 == np.int64
Out[10]: True
In [11]: np.int64 is np.int64
Out[11]: True
Great, just what I would expect. However, I found this weird property of the numpy.core.numeric module:
In [19]: from numpy.core.numeric import _typelessdata
In [20]: _typelessdata
Out[20]: [numpy.int64, numpy.float64, numpy.complex128, numpy.int64]
Weird. Why is numpy.int64 in there twice? Let's investigate.
In [23]: _typelessdata[0] is _typelessdata[-1]
Out[23]: False
In [24]: _typelessdata[0] == _typelessdata[-1]
Out[24]: False
In [25]: id(_typelessdata[-1])
Out[25]: 139990931572128
In [26]: id(_typelessdata[0])
Out[26]: 139990931572544
In [27]: _typelessdata[-1]
Out[27]: numpy.int64
In [28]: _typelessdata[0]
Out[28]: numpy.int64
Whoa, they are different. What is going on here? Why are there two np.int64's?
The distinction between integer types is defined by the number of bytes in the integer (int32 vs. int64), with more bytes holding larger numbers, as well as whether the number is signed or unsigned (int32 vs. uint32), with unsigned types able to hold larger positive numbers but unable to hold negative numbers.
NumPy's int64 (which is what int_ is on my machine) is an integer type represented by 8 bytes (64 bits), and anything beyond that range cannot be represented.
The default data type is float_. The 24 built-in array scalar type objects all convert to an associated data-type object.
int32 is a 32-bit signed integer whose values lie on the interval [−2,147,483,648, +2,147,483,647].
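As an illustrative aside (not part of the original question), np.iinfo reports these ranges directly:
import numpy as np

# np.iinfo reports the representable range of each integer dtype;
# unsigned types trade the sign bit for a larger positive range.
print(np.iinfo(np.int32))   # min = -2147483648, max = 2147483647
print(np.iinfo(np.uint32))  # min = 0, max = 4294967295
print(np.iinfo(np.int64))   # min = -9223372036854775808, max = 9223372036854775807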
Here are the lines where _typelessdata is constructed within numeric.py:
_typelessdata = [int_, float_, complex_]
if issubclass(intc, int):
    _typelessdata.append(intc)
if issubclass(longlong, int):
    _typelessdata.append(longlong)
intc is a C-compatible (32-bit) signed integer, and int is a native Python integer, which may be either 32-bit or 64-bit depending on the platform. On a 32-bit system the native Python int type is also 32-bit, so issubclass(intc, int) returns True and intc gets appended to _typelessdata, which ends up looking like this:
[numpy.int32, numpy.float64, numpy.complex128, numpy.int32]
Note that _typelessdata[-1] is numpy.intc, not numpy.int32.
On a 64-bit system, int is 64-bit, and therefore issubclass(longlong, int) returns True and longlong gets appended to _typelessdata, resulting in:
[numpy.int64, numpy.float64, numpy.complex128, numpy.int64]
In this case, as Joe pointed out, (_typelessdata[-1] is numpy.longlong) == True.
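For what it's worth, a small check along these lines (a sketch, assuming a NumPy version where numpy.core.numeric still exposes _typelessdata) makes the two look-alike entries distinguishable through their dtype character codes:
import numpy as np
from numpy.core.numeric import _typelessdata

# int_ maps to C long (dtype char 'l') while longlong maps to C long long
# (dtype char 'q'), even though both are 64 bits wide on this platform.
for t in _typelessdata:
    print(t, np.dtype(t).char, np.dtype(t).itemsize)

print(_typelessdata[-1] is np.longlong)  # expected True on a 64-bit build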
The bigger question is why the contents of _typelessdata are set like this. The only place I could find in the numpy source where _typelessdata is actually used is this line within the definition of np.array_repr in the same file:
skipdtype = (arr.dtype.type in _typelessdata) and arr.size > 0
The purpose of _typelessdata is to ensure that np.array_repr correctly prints the string representation of arrays whose dtype happens to be the same as the (platform-dependent) native Python integer type. For example, on a 32-bit system, where int is 32-bit:
In [1]: np.array_repr(np.intc([1]))
Out[1]: 'array([1])'
In [2]: np.array_repr(np.longlong([1]))
Out[2]: 'array([1], dtype=int64)'
whereas on a 64-bit system, where int is 64-bit:
In [1]: np.array_repr(np.intc([1]))
Out[1]: 'array([1], dtype=int32)'
In [2]: np.array_repr(np.longlong([1]))
Out[2]: 'array([1])'
The arr.dtype.type in _typelessdata check in the line above ensures that printing the dtype is skipped for the appropriate platform-dependent native integer dtypes.
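To make that concrete, here is a small sketch mirroring the skipdtype check (the helper name show_repr_decision is made up for illustration, and it again assumes _typelessdata is importable as above):
import numpy as np
from numpy.core.numeric import _typelessdata

def show_repr_decision(arr):
    # Same condition as the line quoted from np.array_repr: omit the dtype
    # when the array's scalar type is one of the "typeless" native types.
    skipdtype = arr.dtype.type in _typelessdata and arr.size > 0
    print(arr.dtype, "-> dtype omitted" if skipdtype else "-> dtype printed")

show_repr_decision(np.array([1]))                 # native int dtype: omitted
show_repr_decision(np.array([1], dtype=np.int8))  # non-native dtype: printed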
I don't know the full history behind it, but the second int64 is actually numpy.longlong.
In [1]: import numpy as np
In [2]: from numpy.core.numeric import _typelessdata
In [3]: _typelessdata
Out[3]: [numpy.int64, numpy.float64, numpy.complex128, numpy.int64]
In [5]: id(_typelessdata[-1]) == id(np.longlong)
Out[5]: True
numpy.longlong is supposed to directly correspond to C's long long type. C's long long is specified to be at least 64 bits wide, but the exact definition is left up to the compiler. My guess is that numpy.longlong winds up being another instance of numpy.int64 on most systems, but is allowed to be something different if the C compiler defines long long as something wider than 64 bits.
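If you want to check that guess on your own machine, a hedged sketch like this shows that longlong is value-compatible with int64 while remaining a distinct scalar type object:
import numpy as np

# On most platforms long long is exactly 64 bits, so the dtypes compare
# equal even though the scalar type objects themselves are distinct.
print(np.dtype(np.longlong).itemsize * 8)           # typically 64
print(np.dtype(np.longlong) == np.dtype(np.int64))  # True when long long is 64-bit
print(np.longlong is np.int64)                      # False on 64-bit Linux, where int64 maps to C long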