Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to do a reduction with numpy.nditer in the first axis

Tags:

python

numpy

I'm trying to understand how to work with nditer to do a reduction, in my case converting a 3d array into a 2d array.

I followed the help here http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html and managed to create a function that applies reduction over the last axis of the input. With this function

def nditer_sum(data, red_axes):
    it = numpy.nditer([data, None],
            flags=['reduce_ok', 'external_loop'],
            op_flags=[['readonly'], ['readwrite', 'allocate']],
            op_axes=[None, red_axes])
    it.operands[1][...] = 0

    for x, y in it:
        y[...] = x.sum()

    return it.operands[1]

I can get something equivalent to data.sum(axis=2)

>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> nditer_sum(data, [0, 1, -1])
[[ 6 22 38]
[54 70 86]]
>>> data.sum(axis=2)
[[ 6 22 38]
[54 70 86]]

So to get something equivalent to data.sum(axis=0) I though that it was enough to change the argument red_axes to [-1, 0,1] But the result is quite different.

>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> data.sum(axis=0)
[[12 14 16 18]
 [20 22 24 26]
 [28 30 32 34]]
>>> nditer_sum(data, [-1, 0, 1])
[[210 210 210 210]
 [210 210 210 210] 
 [210 210 210 210]]

In the for loop inside nditer_sum (for x,y in it:), the iterator is looping 2 times and giving an array of length 12 each time, instead of looping 12 times and giving an array of length 2 each time. I have read the numpy documentation several times and googled about this to no avail. I'm using numpy 1.6 and python 2.7

like image 757
Sergio Avatar asked Sep 30 '12 17:09

Sergio


1 Answers

The axis=0 case works correctly if the nditer order is changed to F. Now there are 12 steps with arrays of size (2,) as you wanted.

it = np.nditer([data, None],
        flags=['reduce_ok', 'external_loop'],
        op_flags=[['readonly'], ['readwrite', 'allocate']],
        order='F',     # ADDED to loop starting with the last dimension
        op_axes=[None, red_axes])

But there isn't a solution like this for middle axis=1 case.


Another approach to iterating on selected dimensions is to construct a 'multi_index' iterator on a reduced dimensional array. I discovered in https://stackoverflow.com/a/25097271/901925 that np.ndindex uses this trick to perform a 'shallow iteration'.

For the axis=0 case, this function works:

def sum_1st(data):
    y = np.zeros(data.shape[1:], data.dtype)
    it = np.nditer(y, flags=['multi_index'])
    while not it.finished:
        xindex = tuple([slice(None)]+list(it.multi_index))
        y[it.multi_index] = data[xindex].sum()
        it.iternext()
    return y

Or generalized to any axis:

def sum_any(data, axis=0):
    yshape = list(data.shape)
    del yshape[axis]
    y = np.zeros(yshape, data.dtype)
    it = np.nditer(y, flags=['multi_index'])
    while not it.finished:
        xindex = list(it.multi_index)
        xindex.insert(axis, slice(None))
        y[it.multi_index] = data[xindex].sum()
        it.iternext()
    return y
like image 70
hpaulj Avatar answered Nov 15 '22 19:11

hpaulj