NumPy is really helpful when creating arrays. If the first argument for numpy.array has a __getitem__ and __len__ method, these are used on the basis that it might be a valid sequence.
Unfortunately I want to create an array with dtype=object containing instances of such a class, without NumPy being "helpful".
Broken down to a minimal example, the class would look like this:
import numpy as np
class Test(object):
    def __init__(self, iterable):
        self.data = iterable
    def __getitem__(self, idx):
        return self.data[idx]
    def __len__(self):
        return len(self.data)
    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)
and if the "iterables" have different lengths everything is fine and I get exactly the result I want to have:
>>> np.array([Test([1,2,3]), Test([3,2])], dtype=object)
array([Test([1, 2, 3]), Test([3, 2])], dtype=object)
but NumPy creates a multidimensional array if these happen to have the same length:
>>> np.array([Test([1,2,3]), Test([3,2,1])], dtype=object)
array([[1, 2, 3],
       [3, 2, 1]], dtype=object)
Unfortunately there is only an ndmin argument, so I was wondering if there is a way to enforce an ndmax or somehow prevent NumPy from interpreting the custom classes as another dimension (without deleting __len__ or __getitem__)?
This behavior has been discussed a number of times before (e.g. "Override a dict with numpy support"). np.array tries to make as high-dimensional an array as it can. The model case is nested lists: if it can iterate and the sublists are equal in length, it will 'drill' on down.
Here it went down 2 levels before encountering lists of different length:
In [250]: np.array([[[1,2],[3]],[1,2]],dtype=object)
Out[250]:
array([[[1, 2], [3]],
       [1, 2]], dtype=object)
In [251]: _.shape
Out[251]: (2, 2)
Without a shape or ndmax parameter it has no way of knowing whether I want it to be (2,) or (2,2). Both of those would work with the dtype.
It's compiled code, so it isn't easy to see exactly what tests it uses. It tries to iterate on lists and tuples, but not on sets or dictionaries.
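A quick way to check that claim, as a sketch: the shapes in the comments are what I would expect from NumPy's sequence test (lists and tuples count as sequences, sets and dicts do not), though the exact rules live in compiled code and could differ between versions.

```python
import numpy as np

# Equal-length lists and tuples are treated as another dimension.
from_lists = np.array([[1, 2], [3, 4]], dtype=object)
from_tuples = np.array([(1, 2), (3, 4)], dtype=object)

# Sets and dicts are not iterated; each one becomes a single element.
from_sets = np.array([{1, 2}, {3, 4}], dtype=object)
from_dicts = np.array([{'a': 1}, {'b': 2}], dtype=object)

print(from_lists.shape, from_tuples.shape)  # (2, 2) (2, 2)
print(from_sets.shape, from_dicts.shape)    # (2,) (2,)
```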
The surest way to make an object array with a given dimension is to start with an empty one, and fill it:
In [266]: A=np.empty((2,3),object)
In [267]: A.fill([[1,'one']])
In [276]: A[:]={1,2}
In [277]: A[:]=[1,2] # broadcast error
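Applied to the Test class from the question, the fill approach might look like the sketch below. The helper name as_object_array is made up for this example; it assigns element by element, so NumPy never gets a chance to inspect __len__ or __getitem__.

```python
import numpy as np

class Test(object):
    def __init__(self, iterable):
        self.data = iterable
    def __getitem__(self, idx):
        return self.data[idx]
    def __len__(self):
        return len(self.data)
    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)

def as_object_array(items):
    # Hypothetical helper: build a 1-D object array from a list of objects.
    arr = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        arr[i] = item  # integer indexing stores the object itself
    return arr

A = as_object_array([Test([1, 2, 3]), Test([3, 2, 1])])
print(A.shape)  # (2,)
print(A)
```

The explicit loop is slower than a single slice assignment, but for an object array the cost is only a Python-level copy of references.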
Another way is to start with at least one different element (e.g. a None), and then replace that.
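The placeholder idea could be sketched as follows, again with the Test class from the question. Appending None makes the input irregular, so np.array should stop at one dimension. Slicing the placeholder off again, rather than replacing it, is a choice made for this example.

```python
import numpy as np

class Test(object):
    def __init__(self, iterable):
        self.data = iterable
    def __getitem__(self, idx):
        return self.data[idx]
    def __len__(self):
        return len(self.data)
    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)

items = [Test([1, 2, 3]), Test([3, 2, 1])]

# The trailing None has no length, so the elements cannot form a 2-D array.
padded = np.array(items + [None], dtype=object)

# Drop the placeholder; what remains is a 1-D view of the two Test objects.
B = padded[:-1]
print(B.shape)  # (2,)
```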
There is a more primitive creator, ndarray, that takes a shape:
In [280]: np.ndarray((2,3),dtype=object)
Out[280]:
array([[None, None, None],
       [None, None, None]], dtype=object)
But that's basically the same as np.empty (unless I give it a buffer).
These are fudges, but they aren't expensive (time wise).
================ (edit)
https://github.com/numpy/numpy/issues/5933, "Enh: Object array creation function", is an enhancement request. Also see https://github.com/numpy/numpy/issues/5303, "the error message for accidentally irregular arrays is confusing".
The developer sentiment seems to favor a separate function to create dtype=object arrays, one with more control over the initial dimensions and depth of iteration. They might even strengthen the error checking to keep np.array from creating 'irregular' arrays.
Such a function could detect the shape of a regular nested iterable down to a specified depth, and build an object type array to be filled.
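def objarray(alist, depth=1):
    # Walk down the first element at each level to find the shape.
    shape = []
    l = alist
    for _ in range(depth):
        shape.append(len(l))
        l = l[0]
    # Allocate the object array, then let assignment fill it.
    arr = np.empty(shape, dtype=object)
    arr[:] = alist
    return arr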
With various depths:
In [528]: alist=[[Test([1,2,3])], [Test([3,2,1])]]
In [529]: objarray(alist,1)
Out[529]: array([[Test([1, 2, 3])], [Test([3, 2, 1])]], dtype=object)
In [530]: objarray(alist,2)
Out[530]:
array([[Test([1, 2, 3])],
       [Test([3, 2, 1])]], dtype=object)
In [531]: objarray(alist,3)
Out[531]:
array([[[1, 2, 3]],
       [[3, 2, 1]]], dtype=object)
In [532]: objarray(alist,4)
...
TypeError: object of type 'int' has no len()
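objarray relies on arr[:] = alist to do the filling, and how that assignment treats sequence-like elements may depend on the NumPy version. A more defensive variant, sketched here under the hypothetical name objarray_by_index, copies each element individually. It also covers the flat list from the question with depth=1.

```python
import numpy as np

class Test(object):
    def __init__(self, iterable):
        self.data = iterable
    def __getitem__(self, idx):
        return self.data[idx]
    def __len__(self):
        return len(self.data)
    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)

def objarray_by_index(nested, depth=1):
    # Hypothetical variant of objarray: same shape detection,
    # but every cell is filled through a full integer index.
    shape = []
    level = nested
    for _ in range(depth):
        shape.append(len(level))
        level = level[0]
    arr = np.empty(shape, dtype=object)
    for idx in np.ndindex(*shape):
        item = nested
        for i in idx:
            item = item[i]  # descend one level per index
        arr[idx] = item     # stores the object without inspecting it
    return arr

flat = objarray_by_index([Test([1, 2, 3]), Test([3, 2, 1])], depth=1)
nested = objarray_by_index([[Test([1, 2, 3])], [Test([3, 2, 1])]], depth=2)
print(flat.shape, nested.shape)  # (2,) (2, 1)
```

Like objarray, this assumes the input is regular down to the requested depth; it only looks at the first element of each level to work out the shape.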