numpy vectorization instead of for loops

Question

I wrote some code in Python which works fine but is very slow; I think due to the for loops. I hope one can speed up the following operations using numpy commands. Let me define the goal.

Let's assume I have a 2D numpy array all_CMs of dimensions row x col. For instance consider a 6x11 array (see drawing below).

I want to calculate the mean for all rows, i.e. sumⱼ aᵢⱼ resulting in an array. This, of course can be easily done. (I call this value CM_tilde)
Now, for each row I want to calculate the mean of some selected values, namely all values below a certain threshold by computing their sum and dividing it by the number of all columns (N). If the value is above this defined threshold, the CM_tilde value (mean of the entire row) is added. This value is called CM
Afterwards, the CM value is subtracted from each element in the row

In addition to this I want to have a numpy array or list where all those CM values are listed.

The figure:

The following code is working but very slow (especially if the arrays getting large)

CM_tilde = np.mean(data, axis=1)
N = data.shape[1]
data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] ))
all_CMs = np.zeros(( data.shape[0], data.shape[2]))
for frame in range(data.shape[2]):
    for row in range(data.shape[0]):
        CM=0
        for col in range(data.shape[1]):
            if data[row, col, frame] < (CM_tilde[row, frame]+threshold):
               CM += data[row, col, frame]
            else:
               CM += CM_tilde[row, frame]
        CM = CM/N
        all_CMs[row, frame] = CM
        # calculate CM corrected value
        for col in range(data.shape[1]):
            data_cm[row, col, frame] = data[row, col, frame] - CM
    print "frame: ", frame
return data_cm, all_CMs

Any ideas?

Andras Deak -- Слава Україні · Accepted Answer

It's quite easy to vectorize what you're doing:

import numpy as np

#generate dummy data
nrows=6
ncols=11
nframes=3
threshold=0.3
data=np.random.rand(nrows,ncols,nframes)

CM_tilde = np.mean(data, axis=1)
N = data.shape[1]

all_CMs2 = np.mean(np.where(data < (CM_tilde[:,None,:]+threshold),data,CM_tilde[:,None,:]),axis=1)
data_cm2 = data - all_CMs2[:,None,:]

Comparing this with your originals:

In [684]: (data_cm==data_cm2).all()
Out[684]: True

In [685]: (all_CMs==all_CMs2).all()
Out[685]: True

The logic is that we work with arrays of size [nrows,ncols,nframes] simultaneously. The main trick is to make use of python's broadcasting, by turning CM_tilde of size [nrows,nframes] into CM_tilde[:,None,:] of size [nrows,1,nframes]. Python will then use the same values for each column, since that is a singleton dimension of this modified CM_tilde.

By using np.where we choose (based on the threshold) whether we want to get the corresponding value of data, or, again, the broadcast value of CM_tilde. A new use of np.mean allows us to compute all_CMs2.

In the final step we made use of broadcasting by directly subtracting this new all_CMs2 from the corresponding elements of data.

It might help in vectorizing code this way by looking at the implicit indices of your temporary variables. What I mean is that your temporary variable CM lives inside a loop over [nrows,nframes], and its value is reset with each iteration. This means that CM is in effect a quantity CM[row,frame] (later explicitly assigned to the 2d array all_CMs), and from here it's easy to see that you can construct it by summing up an appropriate CMtmp[row,col,frames] quantity along its column dimension. If it helps, you can name the np.where(...) part as CMtmp for this purpose, and then compute np.mean(CMtmp,axis=1) from that. Same result, obviously, but probably more transparent.

hpaulj · Answer

Here's my vectorization of your function. I worked from the inside out, and commented out earlier versions as I went along. So the first loop that I vectorized has ### comment marks.

It isn't as clean and well reasoned as @Andras's answer, but hopefully it is instructional, giving an idea of how you can address this issue incrementally.

def foo2(data, threshold):
    CM_tilde = np.mean(data, axis=1)
    N = data.shape[1]
    #data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] ))
    ##all_CMs = np.zeros(( data.shape[0], data.shape[2]))
    bmask = data < (CM_tilde[:,None,:] + threshold)
    CM = np.zeros_like(data)
    CM[:] = CM_tilde[:,None,:]
    CM[bmask] = data[bmask]
    CM = CM.sum(axis=1)
    CM = CM/N
    all_CMs = CM.copy()
    """
    for frame in range(data.shape[2]):
        for row in range(data.shape[0]):
            ###print(frame, row)
            ###mask = data[row, :, frame] < (CM_tilde[row, frame]+threshold)
            ###print(mask)
            ##mask = bmask[row,:,frame]
            ##CM = data[row, mask, frame].sum()
            ##CM += (CM_tilde[row, frame]*(~mask)).sum()

            ##CM = CM/N
            ##all_CMs[row, frame] = CM
            ## calculate CM corrected value
            #for col in range(data.shape[1]):
            #    data_cm[row, col, frame] = data[row, col, frame] - CM[row,frame]
        print "frame: ", frame
    """
    data_cm = data - CM[:,None,:]
    return data_cm, all_CMs

Output matches for this small test case, which more than anything helped me get the dimensions right.

threshold = .1
data = np.arange(4*3*2,dtype=float).reshape(4,3,2)

numpy vectorization instead of for loops

Tags:

python

vectorization

numpy

pallago

2 Answers

Andras Deak -- Слава Україні

hpaulj

Recent Activity

Donate For Us

numpy vectorization instead of for loops

Tags:

python

vectorization

numpy

pallago

2 Answers

Andras Deak -- Слава Україні

hpaulj

Related questions

Recent Activity

Donate For Us