How to vectorize a loop through a matrix numpy

Question

Suppose I have a matrix that is 100000 x 100

import numpy as np

mat = np.random.randint(2, size=(100000,100))

I wish to go through this matrix, and if each row contains entirely either 1 or 0 I wish to change a state variable to that value. If the state is not changed, I wish to set the entire row the value of state. The initial value of state is 0.

Naively in a for loop this can be done as follows

state = 0

for row in mat:
    if set(row) == {1}:
        state = 1
    elif set(row) == {0}:
        state = 0
    else:
        row[:] = state

However, when the size of the matrix increases this takes an impractical amount of time. Could someone point me in the direction in how to leverage numpy to vectorize this loop and speed it up?

So for a sample input

array([[0, 1, 0],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1],
       [0, 0, 1]])

The expected output in this case would be

array([[0, 0, 0],
       [0, 0, 0],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

Divakar · Accepted Answer

Approach #1: NumPy-Vectorized

Here's a vectorized one -

def check_all(a, state): # a is input matrix/array
    # Get zeros and ones all masks
    zm = (a==0).all(1)
    om = (a==1).all(1)

    # "Attach" boundaries with False values at the start of these masks.
    # These will be used to detect rising edges (as indices) on these masks.
    zma = np.r_[False,zm]
    oma = np.r_[False,om]

    omi = np.flatnonzero(oma[:-1] < oma[1:])
    zmi = np.flatnonzero(zma[:-1] < zma[1:])

    # Group the indices and the signatures (values as 1s and -1s)
    ai = np.r_[omi,zmi]
    av = np.r_[np.ones(len(omi),dtype=int),-np.ones(len(zmi),dtype=int)]

    # Sort the grouped-indices, thus we would know the positions
    # of these group starts. Then index into the signatures/values
    # and indices with those, giving us the information on how these signatures
    # occur through the length of the input
    sidx = ai.argsort()
    val,aidx = av[sidx],ai[sidx]

    # The identical consecutive signatures are to be removed
    mask = np.r_[True,val[:-1]!=val[1:]]
    v,i = val[mask],aidx[mask]

    # Also, note that we are assigning all 1s as +1 signature and all 0s as -1
    # So, in case the starting signature is a 0, assign a value of 0
    if v[0]==-1:
        v[0] = 0

    # Initialize 1D o/p array, which stores the signatures as +1s and -1s.
    # The bigger level idea is that performing cumsum at the end would give us the
    # desired 1D output
    out1d = np.zeros(len(a),dtype=a.dtype)

    # Assign the values at i positions
    out1d[i] = v

    # Finally cumsum to get desired output
    out1dc = out1d.cumsum()

    # Correct the starting positions based on starting state value
    out1dc[:i[0]] = state

    # Convert to 2D view for mem. and perf. efficiency
    out = np.broadcast_to(out1dc[:,None],a.shape)
    return out

Approach #2: Numba-based

Here's another numba-based one for memory and hence perf. efficiency -

@njit(parallel=True)
def func1(zm, om, out, start_state, cur_state):
    # This outputs 1D version of required output.

    # Start off with the starting given state
    newval = start_state

    # Loop through zipped zeros-all and ones-all masks and in essence do :
    # Switch between zeros and ones based on whether the other ones
    # are occuring through or not, prior to the current state
    for i,(z,o) in enumerate(zip(zm,om)):
        if z and cur_state:
            cur_state = ~cur_state
            newval = 0
        if o and ~cur_state:
            cur_state = ~cur_state
            newval = 1
        out[i] = newval
    return out

def check_all_numba(a, state):
    # Get zeros and ones all masks
    zm = (a==0).all(1)
    om = (a==1).all(1)

    # Decide the starting state
    cur_state = zm.argmax() < om.argmax()

    # Initialize 1D o/p array with given state values
    out1d = np.full(len(a), fill_value=state)
    func1(zm, om, out1d, state, cur_state)

    # Broadcast into the 2D view for memory and perf. efficiency
    return np.broadcast_to(out1d[:,None],a.shape)

Alain T. · Answer

You can do this without any loops by leveraging np.accumulate:

R = 5 # 100000
C = 3 # 100

mat   = np.random.randint(2, size=(R,C))
print(mat) # original matrix

state    = np.zeros((1,C))                        # or np.ones((1,C))
mat      = np.concatenate([state,mat])            # insert state row
zRows    = np.isin(np.sum(mat,1),[0,C])           # all zeroes or all ones
iRows    = np.arange(R+1) * zRows.astype(np.int)  # base indexes
mat      = mat[np.maximum.accumulate(iRows)][1:]  # indirection, remove state
print(mat) # modified

#original
[[0 0 1]
 [1 1 1]
 [1 0 1]
 [0 0 0]
 [1 0 1]]

# modified
[[0 0 0]
 [1 1 1]
 [1 1 1]
 [0 0 0]
 [0 0 0]]

The way it works is by preparing an indirection array for rows that need to be changed. This is done from an np.arange of row indexes in which we set to zero the the indexes that will need replacement. Accumulating the maximum index will map each replaced row to an all-zero or all-one row before it.

For example:

  [ 0, 1, 2, 3, 4, 5 ] # row indexes
  [ 0, 1, 0, 0, 1, 0 ] # rows that are all zeroes or all ones (zRows)
  [ 0, 1, 0, 0, 4, 0 ] # multiplied (iRows)
  [ 0, 1, 1, 1, 4, 4 ] # np.maximum.accumulate

This gives us a list of indexes where row content should be taken from.

The state is represented by an extra row inserted at the beginning of the matrix before performing the operation and removed afterward.

This solution will be marginally slower for very small matrices (5x3) but it can give you a 20x speed boost for larger ones (100000x100: 0.7 second vs 14 seconds).

How to vectorize a loop through a matrix numpy

Tags:

performance

python

optimization

vectorization

numpy

mch56

2 Answers

Divakar

Alain T.

Recent Activity

Donate For Us

How to vectorize a loop through a matrix numpy

Tags:

performance

python

optimization

vectorization

numpy

mch56

2 Answers

Divakar

Alain T.

Related questions

Recent Activity

Donate For Us