Python numpy ndarrays are failing me! Can I go back to Matlab??
Let's say I have a function that expects an ndarray vector input. I use the numpy.asarray function to force the inputs into the form I want, conveniently with no duplication for things that are already ndarrays. However, if a scalar gets passed in, it is sometimes made into a 0d array instead of a 1d array, depending on exactly how it got passed in. The 0d array causes issues, because I can't index into it.
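For example, a quick interactive check shows the difference:

>>> import numpy as np
>>> np.asarray(5).ndim
0
>>> np.asarray([5]).ndim
1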
First off, why can't I? Say x = np.array(1). Then x.size == 1, so it should have a 0th element. Why can't I do x[0] or x[-1]? I get that it wants to be like a Python int, but it should be improved over an int, not purposely given the same limitations.
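To illustrate (the exact IndexError wording varies by numpy version):

>>> x = np.array(1)
>>> x.size
1
>>> x[0]
Traceback (most recent call last):
  ...
IndexError: too many indices for array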
Secondly, it would be awesome if the numpy asarray function had some optional input to force the output to always be at least a 1d array. Then I could do something like x = np.asarray(x, force_at_least_1d=True).
However, the best option I could come up with is to check the ndim property, and if it's 0, then expand it to 1. This just feels wrong to me. Is there some other option that I'm missing?
import numpy as np

def func(x, extra_check=True):
    r"""Meaningless example function for Stack Overflow."""
    # force input to be an ndarray
    x = np.asarray(x)
    if x.size == 0:
        print("Don't do anything.")
        return
    # Extra test to deal with garbage 0d arrays so that they can be indexed below.
    # This test is really bothering me. Is there a better way to make it unnecessary?
    if extra_check and (x.ndim == 0):
        x = x[np.newaxis]
    if x[0] > 0 and x[-1] > 5:
        print('Do something cool.')
    else:
        print('Do something less cool.')
if __name__ == '__main__':
    # nominally intended use
    x1 = np.array([1, 2, 10])
    y1 = func(x1)  # prints "Do something cool."
    # single item use
    x2 = x1[x1 == 2]
    y2 = func(x2)  # prints "Do something less cool."
    # scalar single item use that works with the extra check
    x3 = x1[1]
    y3 = func(x3)  # prints "Do something less cool."
    # scalar single item that will fail without the extra check
    x4 = x1[1]
    y4 = func(x4, extra_check=False)  # raises IndexError
So my main question here is whether there's a better way than what I have. And if not, do others agree that there should be? I'm relatively new to Python, so I've never tried to contribute anything to the source yet, but presumably I can look for another question that explains that process to me.
In case it matters, I'm on Python 3.5.1 and numpy 1.9.3. Thanks!
Secondly, it would be awesome if the numpy asarray function had some optional input to force the output to always be at least a 1d array. Then I could do something like x = np.asarray(x, force_at_least_1d=True).
np.asarray doesn't, but np.array does, via its ndmin argument, and there's a dedicated np.atleast_1d function (there are also _2d and _3d versions):
>>> np.array(0, ndmin=1)
array([0])
>>> np.atleast_1d(np.array(0))
array([0])
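With atleast_1d, the extra check in the question's func becomes unnecessary. A minimal sketch of that rewrite (same toy function as in the question):

import numpy as np

def func(x):
    # promotes scalars and 0d arrays to 1d, leaves 1d-or-higher input alone
    x = np.atleast_1d(x)
    if x.size == 0:
        print("Don't do anything.")
    elif x[0] > 0 and x[-1] > 5:
        print('Do something cool.')
    else:
        print('Do something less cool.')

func(np.array([1, 2, 10]))     # prints "Do something cool."
func(np.array([1, 2, 10])[1])  # scalar input now works: "Do something less cool."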
Any array can be indexed with a tuple of x.ndim elements:
2d:
In [238]: x=np.array([[1]])
In [239]: x.ndim
Out[239]: 2
In [240]: x[(0,0)] # same as x[0,0]
Out[240]: 1
1d:
In [241]: x=np.array([1])
In [242]: x[(0,)] # (0,) to distinguish from (0)==0
Out[242]: 1
0d:
In [243]: x=np.array(1)
In [244]: x[()] # empty tuple
Out[244]: 1
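Relatedly, .item() also extracts the value of a 0d array, as a plain Python scalar:

>>> np.array(1).item()
1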
Indexing a single element doesn't actually return a plain Python scalar:
In [250]: x=np.array([[1]])
In [251]: type(x[0,0])
Out[251]: numpy.int32
In [252]: x[0,0][()]
Out[252]: array(1)
It returns a numpy scalar (an instance of numpy.int32 here), which still accepts 0d (empty-tuple) indexing.
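A quick way to confirm it's a numpy scalar (np.generic is the base class of all numpy scalar types; the concrete integer type is platform-dependent, e.g. int32 on Windows, int64 on most Linux builds):

>>> x = np.array([[1]])
>>> v = x[0, 0]
>>> isinstance(v, np.generic)
True
>>> v.ndim, v.shape
(0, ())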
You mention MATLAB: there everything is 2d (or higher). Isn't it more logical to set 0d as the lower bound? :)
The other answer mentioned the ndmin parameter, as well as atleast_1d (there are also _2d and _3d versions). Look at the docs and source of atleast_1d to see how it reshapes the various cases, e.g.:
if len(ary.shape) == 0:
    result = ary.reshape(1)
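Trying it on a few representative inputs:

>>> np.atleast_1d(5)
array([5])
>>> np.atleast_1d(np.array(5))
array([5])
>>> np.atleast_1d([1, 2])
array([1, 2])
>>> np.atleast_1d(np.ones((2, 2))).shape  # already >= 1d, left unchanged
(2, 2)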