Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

numpy vectorization instead of for loops

I wrote some code in Python which works fine but is very slow; I think due to the for loops. I hope one can speed up the following operations using numpy commands. Let me define the goal.

Let's assume I have a 2D numpy array all_CMs of dimensions row x col. For instance consider a 6x11 array (see drawing below).

  1. I want to calculate the mean for all rows, i.e. sumⱼ aᵢⱼ resulting in an array. This, of course can be easily done. (I call this value CM_tilde)

  2. Now, for each row I want to calculate the mean of some selected values, namely all values below a certain threshold by computing their sum and dividing it by the number of all columns (N). If the value is above this defined threshold, the CM_tilde value (mean of the entire row) is added. This value is called CM

  3. Afterwards, the CM value is subtracted from each element in the row

In addition to this I want to have a numpy array or list where all those CM values are listed.

The figure:

figure

The following code is working but very slow (especially if the arrays getting large)

CM_tilde = np.mean(data, axis=1)
N = data.shape[1]
data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] ))
all_CMs = np.zeros(( data.shape[0], data.shape[2]))
for frame in range(data.shape[2]):
    for row in range(data.shape[0]):
        CM=0
        for col in range(data.shape[1]):
            if data[row, col, frame] < (CM_tilde[row, frame]+threshold):
               CM += data[row, col, frame]
            else:
               CM += CM_tilde[row, frame]
        CM = CM/N
        all_CMs[row, frame] = CM
        # calculate CM corrected value
        for col in range(data.shape[1]):
            data_cm[row, col, frame] = data[row, col, frame] - CM
    print "frame: ", frame
return data_cm, all_CMs

Any ideas?

like image 618
pallago Avatar asked Nov 18 '15 22:11

pallago


2 Answers

It's quite easy to vectorize what you're doing:

import numpy as np

#generate dummy data
nrows=6
ncols=11
nframes=3
threshold=0.3
data=np.random.rand(nrows,ncols,nframes)

CM_tilde = np.mean(data, axis=1)
N = data.shape[1]

all_CMs2 = np.mean(np.where(data < (CM_tilde[:,None,:]+threshold),data,CM_tilde[:,None,:]),axis=1)
data_cm2 = data - all_CMs2[:,None,:]

Comparing this with your originals:

In [684]: (data_cm==data_cm2).all()
Out[684]: True

In [685]: (all_CMs==all_CMs2).all()
Out[685]: True

The logic is that we work with arrays of size [nrows,ncols,nframes] simultaneously. The main trick is to make use of python's broadcasting, by turning CM_tilde of size [nrows,nframes] into CM_tilde[:,None,:] of size [nrows,1,nframes]. Python will then use the same values for each column, since that is a singleton dimension of this modified CM_tilde.

By using np.where we choose (based on the threshold) whether we want to get the corresponding value of data, or, again, the broadcast value of CM_tilde. A new use of np.mean allows us to compute all_CMs2.

In the final step we made use of broadcasting by directly subtracting this new all_CMs2 from the corresponding elements of data.

It might help in vectorizing code this way by looking at the implicit indices of your temporary variables. What I mean is that your temporary variable CM lives inside a loop over [nrows,nframes], and its value is reset with each iteration. This means that CM is in effect a quantity CM[row,frame] (later explicitly assigned to the 2d array all_CMs), and from here it's easy to see that you can construct it by summing up an appropriate CMtmp[row,col,frames] quantity along its column dimension. If it helps, you can name the np.where(...) part as CMtmp for this purpose, and then compute np.mean(CMtmp,axis=1) from that. Same result, obviously, but probably more transparent.

like image 132

Here's my vectorization of your function. I worked from the inside out, and commented out earlier versions as I went along. So the first loop that I vectorized has ### comment marks.

It isn't as clean and well reasoned as @Andras's answer, but hopefully it is instructional, giving an idea of how you can address this issue incrementally.

def foo2(data, threshold):
    CM_tilde = np.mean(data, axis=1)
    N = data.shape[1]
    #data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] ))
    ##all_CMs = np.zeros(( data.shape[0], data.shape[2]))
    bmask = data < (CM_tilde[:,None,:] + threshold)
    CM = np.zeros_like(data)
    CM[:] = CM_tilde[:,None,:]
    CM[bmask] = data[bmask]
    CM = CM.sum(axis=1)
    CM = CM/N
    all_CMs = CM.copy()
    """
    for frame in range(data.shape[2]):
        for row in range(data.shape[0]):
            ###print(frame, row)
            ###mask = data[row, :, frame] < (CM_tilde[row, frame]+threshold)
            ###print(mask)
            ##mask = bmask[row,:,frame]
            ##CM = data[row, mask, frame].sum()
            ##CM += (CM_tilde[row, frame]*(~mask)).sum()

            ##CM = CM/N
            ##all_CMs[row, frame] = CM
            ## calculate CM corrected value
            #for col in range(data.shape[1]):
            #    data_cm[row, col, frame] = data[row, col, frame] - CM[row,frame]
        print "frame: ", frame
    """
    data_cm = data - CM[:,None,:]
    return data_cm, all_CMs

Output matches for this small test case, which more than anything helped me get the dimensions right.

threshold = .1
data = np.arange(4*3*2,dtype=float).reshape(4,3,2)
like image 39
hpaulj Avatar answered Oct 18 '22 23:10

hpaulj