I'm trying to understand how to work with nditer to do a reduction, in my case converting a 3d array into a 2d array.
I followed the help here http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html and managed to create a function that applies reduction over the last axis of the input. With this function
def nditer_sum(data, red_axes):
it = numpy.nditer([data, None],
flags=['reduce_ok', 'external_loop'],
op_flags=[['readonly'], ['readwrite', 'allocate']],
op_axes=[None, red_axes])
it.operands[1][...] = 0
for x, y in it:
y[...] = x.sum()
return it.operands[1]
I can get something equivalent to data.sum(axis=2)
>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> nditer_sum(data, [0, 1, -1])
[[ 6 22 38]
[54 70 86]]
>>> data.sum(axis=2)
[[ 6 22 38]
[54 70 86]]
So to get something equivalent to data.sum(axis=0) I though that it was enough to change the argument red_axes to [-1, 0,1] But the result is quite different.
>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> data.sum(axis=0)
[[12 14 16 18]
[20 22 24 26]
[28 30 32 34]]
>>> nditer_sum(data, [-1, 0, 1])
[[210 210 210 210]
[210 210 210 210]
[210 210 210 210]]
In the for loop inside nditer_sum (for x,y in it:), the iterator is looping 2 times and giving an array of length 12 each time, instead of looping 12 times and giving an array of length 2 each time. I have read the numpy documentation several times and googled about this to no avail. I'm using numpy 1.6 and python 2.7
The axis=0
case works correctly if the nditer
order is changed to F
. Now there are 12 steps with arrays of size (2,) as you wanted.
it = np.nditer([data, None],
flags=['reduce_ok', 'external_loop'],
op_flags=[['readonly'], ['readwrite', 'allocate']],
order='F', # ADDED to loop starting with the last dimension
op_axes=[None, red_axes])
But there isn't a solution like this for middle axis=1
case.
Another approach to iterating on selected dimensions is to construct a 'multi_index' iterator on a reduced dimensional array. I discovered in https://stackoverflow.com/a/25097271/901925 that np.ndindex
uses this trick to perform a 'shallow iteration'.
For the axis=0
case, this function works:
def sum_1st(data):
y = np.zeros(data.shape[1:], data.dtype)
it = np.nditer(y, flags=['multi_index'])
while not it.finished:
xindex = tuple([slice(None)]+list(it.multi_index))
y[it.multi_index] = data[xindex].sum()
it.iternext()
return y
Or generalized to any axis:
def sum_any(data, axis=0):
yshape = list(data.shape)
del yshape[axis]
y = np.zeros(yshape, data.dtype)
it = np.nditer(y, flags=['multi_index'])
while not it.finished:
xindex = list(it.multi_index)
xindex.insert(axis, slice(None))
y[it.multi_index] = data[xindex].sum()
it.iternext()
return y
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With