Suppose I have a matrix that is 100000 x 100
import numpy as np
mat = np.random.randint(2, size=(100000,100))
I wish to go through this matrix, and if each row
contains entirely either 1 or 0 I wish to change a state
variable to that value. If the state is not changed, I wish to set the entire row
the value of state
. The initial value of state
is 0.
Naively in a for
loop this can be done as follows
state = 0
for row in mat:
if set(row) == {1}:
state = 1
elif set(row) == {0}:
state = 0
else:
row[:] = state
However, when the size of the matrix increases this takes an impractical amount of time. Could someone point me in the direction in how to leverage numpy
to vectorize this loop and speed it up?
So for a sample input
array([[0, 1, 0],
[0, 0, 1],
[1, 1, 1],
[0, 0, 1],
[0, 0, 1]])
The expected output in this case would be
array([[0, 0, 0],
[0, 0, 0],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
Approach #1: NumPy-Vectorized
Here's a vectorized one -
def check_all(a, state): # a is input matrix/array
# Get zeros and ones all masks
zm = (a==0).all(1)
om = (a==1).all(1)
# "Attach" boundaries with False values at the start of these masks.
# These will be used to detect rising edges (as indices) on these masks.
zma = np.r_[False,zm]
oma = np.r_[False,om]
omi = np.flatnonzero(oma[:-1] < oma[1:])
zmi = np.flatnonzero(zma[:-1] < zma[1:])
# Group the indices and the signatures (values as 1s and -1s)
ai = np.r_[omi,zmi]
av = np.r_[np.ones(len(omi),dtype=int),-np.ones(len(zmi),dtype=int)]
# Sort the grouped-indices, thus we would know the positions
# of these group starts. Then index into the signatures/values
# and indices with those, giving us the information on how these signatures
# occur through the length of the input
sidx = ai.argsort()
val,aidx = av[sidx],ai[sidx]
# The identical consecutive signatures are to be removed
mask = np.r_[True,val[:-1]!=val[1:]]
v,i = val[mask],aidx[mask]
# Also, note that we are assigning all 1s as +1 signature and all 0s as -1
# So, in case the starting signature is a 0, assign a value of 0
if v[0]==-1:
v[0] = 0
# Initialize 1D o/p array, which stores the signatures as +1s and -1s.
# The bigger level idea is that performing cumsum at the end would give us the
# desired 1D output
out1d = np.zeros(len(a),dtype=a.dtype)
# Assign the values at i positions
out1d[i] = v
# Finally cumsum to get desired output
out1dc = out1d.cumsum()
# Correct the starting positions based on starting state value
out1dc[:i[0]] = state
# Convert to 2D view for mem. and perf. efficiency
out = np.broadcast_to(out1dc[:,None],a.shape)
return out
Approach #2: Numba-based
Here's another numba-based one for memory and hence perf. efficiency -
@njit(parallel=True)
def func1(zm, om, out, start_state, cur_state):
# This outputs 1D version of required output.
# Start off with the starting given state
newval = start_state
# Loop through zipped zeros-all and ones-all masks and in essence do :
# Switch between zeros and ones based on whether the other ones
# are occuring through or not, prior to the current state
for i,(z,o) in enumerate(zip(zm,om)):
if z and cur_state:
cur_state = ~cur_state
newval = 0
if o and ~cur_state:
cur_state = ~cur_state
newval = 1
out[i] = newval
return out
def check_all_numba(a, state):
# Get zeros and ones all masks
zm = (a==0).all(1)
om = (a==1).all(1)
# Decide the starting state
cur_state = zm.argmax() < om.argmax()
# Initialize 1D o/p array with given state values
out1d = np.full(len(a), fill_value=state)
func1(zm, om, out1d, state, cur_state)
# Broadcast into the 2D view for memory and perf. efficiency
return np.broadcast_to(out1d[:,None],a.shape)
You can do this without any loops by leveraging np.accumulate:
R = 5 # 100000
C = 3 # 100
mat = np.random.randint(2, size=(R,C))
print(mat) # original matrix
state = np.zeros((1,C)) # or np.ones((1,C))
mat = np.concatenate([state,mat]) # insert state row
zRows = np.isin(np.sum(mat,1),[0,C]) # all zeroes or all ones
iRows = np.arange(R+1) * zRows.astype(np.int) # base indexes
mat = mat[np.maximum.accumulate(iRows)][1:] # indirection, remove state
print(mat) # modified
#original
[[0 0 1]
[1 1 1]
[1 0 1]
[0 0 0]
[1 0 1]]
# modified
[[0 0 0]
[1 1 1]
[1 1 1]
[0 0 0]
[0 0 0]]
The way it works is by preparing an indirection array for rows that need to be changed. This is done from an np.arange of row indexes in which we set to zero the the indexes that will need replacement. Accumulating the maximum index will map each replaced row to an all-zero or all-one row before it.
For example:
[ 0, 1, 2, 3, 4, 5 ] # row indexes
[ 0, 1, 0, 0, 1, 0 ] # rows that are all zeroes or all ones (zRows)
[ 0, 1, 0, 0, 4, 0 ] # multiplied (iRows)
[ 0, 1, 1, 1, 4, 4 ] # np.maximum.accumulate
This gives us a list of indexes where row content should be taken from.
The state is represented by an extra row inserted at the beginning of the matrix before performing the operation and removed afterward.
This solution will be marginally slower for very small matrices (5x3) but it can give you a 20x speed boost for larger ones (100000x100: 0.7 second vs 14 seconds).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With