I have a function func which is vectorizable, and I vectorize it using np.frompyfunc. Rather than using a nested for loop, I want to call it only once, padding the inputs with np.newaxis as needed. My goal is to get rid of the two nested for loops and use the numpy.array broadcasting feature instead.
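For example, the kind of call I am after looks like this (a toy sketch, not the real func; add_scalar is just an illustrative stand-in):

import numpy as np

# frompyfunc wraps a scalar Python function as a ufunc, so it broadcasts
add_scalar = np.frompyfunc(lambda a, b: a + b, 2, 1)  # 2 inputs, 1 output
a = np.array([1, 2, 3])
b = np.array([10, 20])
outer = add_scalar(a[:, np.newaxis], b[np.newaxis, :])  # shape (3, 2), dtype=object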
Here is the MWE of the for loops (I want to get rid of the for loops and instead pad the variables c_0, c_1, rn_1, rn_2, and factor when calling myfunc):
for i, b_0 in enumerate(b_00):
    for j, b_1 in enumerate(b_11):
        factor = b_0 + b_1
        rn = (b_0 * coord + b_1 * coord2) / factor
        rn_1 = coord - rn
        rn_2 = coord2 - rn
        results = np.add(results, np.prod(myfunc(c_0[:, None, :], c_1[None, :, :], rn_1, rn_2, factor), axis=2))
The explicit for loop above is correct and is included in the full code block below for quality assurance. Here is my attempt at broadcasting:
factor = b_00[:, None] + b_11[None, :]
rn = np.add((b_00[:, None] * coord[None, :])[:, None, :], (b_11[:, None] * coord2[None, :])[None, :, :]) / factor[:, :, None]
rn_1 = coord - rn
rn_2 = coord2 - rn
results2 = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None], rn_1[:, :, :], rn_2[:, :, :], factor[None, :, :, None]), axis=2)
results2 = np.squeeze(results2)
results2 = np.sum(results2, axis=(2, 3))
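For reference, the shapes in that call are (using the definitions from the full MWE below):

# c_0[:, None, :, None, None]  ->  (5, 1, 3, 1, 1)
# c_1[None, :, :, None, None]  ->  (1, 3, 3, 1, 1)
# rn_1, rn_2                   ->  (3, 2, 3)   i.e. (len(b_00), len(b_11), len(coord))
# factor[None, :, :, None]     ->  (1, 3, 2, 1)
# Broadcasting aligns trailing axes, so the leading len(b_00) axis of rn_1
# lands on the coordinate axis of c_0; the shapes only happen to be
# compatible here because len(b_00) == len(coord) == 3.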
Here is the full MWE:

import numpy as np

# Vectorized function to be sent broadcasted arrays
def func(r0, r1, x, y, at):
    val = 0.0
    for i in range(r0 + 1):
        for j in range(r1 + 1):
            val += x + i * j + at * y
    return val

myfunc = np.frompyfunc(func, 5, 1)
################################################################
# Prep arrays needed for MWE
################################################################
results = np.zeros((5,3))
coord = np.array([1,1,2])
coord2 = np.array([3,3,3])
c_0 = np.array([[0,0,2],[0,2,0],[2,0,0],[1,1,0], [1,0,1]])
c_1 = np.array([[0,0,2],[0,2,0],[2,0,0]])
b_00 = np.array([2.2, 1.1, 4.4]) # np.array([2.2, 3.3, 40.4])
b_11 = np.array([1.2, 3.3]) # np.array([1.2, 5.3])
################################################################
# This is only for comparison. `results` is the correct answer
################################################################
for i, b_0 in enumerate(b_00):
    for j, b_1 in enumerate(b_11):
        factor = b_0 + b_1
        rn = (b_0 * coord + b_1 * coord2) / factor
        rn_1 = coord - rn
        rn_2 = coord2 - rn
        results = np.add(results, np.prod(myfunc(c_0[:, None, :], c_1[None, :, :], rn_1, rn_2, factor), axis=2))
################################################################
# Prep for broadcasting (My attempt)
################################################################
factor = b_00[:, None] + b_11[None, :]
rn = np.add((b_00[:, None] * coord[None, :])[:, None, :], (b_11[:, None] * coord2[None, :])[None, :, :]) / factor[:, :, None]
rn_1 = coord - rn
rn_2 = coord2 - rn
# The following all get the same *wrong* answer
# results2 = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None], rn_1[:, None, None], rn_2[:, None, None], factor[None, :, :]), axis=2)
# results2 = np.prod(myfunc(c_0[:, None, :, None, None, None], c_1[None, :, :, None, None, None], rn_1[None, None, :, :, :, None], rn_2[None, None, :, :, :, None], factor[None, :, :, None, None]), axis=2)
# results2 = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None], rn_1[None, None, :, :, :], rn_2[None, None, :, :, :], factor[None, :, :, None]), axis=2)
# This should be the only line needing work!
results2 = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None], rn_1[:, :, :], rn_2[:, :, :], factor[None, :, :, None]), axis=2)
results2 = np.squeeze(results2)
results2 = np.sum(results2, axis=(2, 3))
assert np.allclose(results, results2)
Above, results2 is my attempt at broadcasting, and results is the slow for loop that gives the correct answer; results2 comes out with the wrong values. As @hpaulj pointed out, if I change b_00 to have length 4 (or anything larger), my solution ceases even to produce the correct shape. Please make sure that it works both for the current b_00 = np.array([2.2, 1.1, 4.4]) and for a more general b_00 = np.array([2.2, 1.1, 4.4, 5.1, 6.2]). I would like a broadcasted solution, but will accept a solution that is simply faster than the for loops.
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Arithmetic on arrays normally operates on corresponding elements, so two arrays of exactly the same shape combine elementwise; subject to certain constraints, the smaller array is "broadcast" across the larger array so that their shapes become compatible. Simply put, broadcasting is the set of rules that determines the output shape when you perform operations between n-dimensional arrays or scalar values.
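For example, a minimal sketch of the rule (trailing axes are compared, and two dimensions are compatible when they are equal or one of them is 1):

import numpy as np

a = np.arange(6).reshape(3, 2, 1)  # shape (3, 2, 1)
b = np.arange(4).reshape(1, 4)     # shape (1, 4), treated as (1, 1, 4)
c = a + b                          # broadcast result has shape (3, 2, 4)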
This should solve your problem!
# Definitions
coord = np.array([1, 1, 2])
coord2 = np.array([3, 3, 3])
c_0 = np.array([[0, 0, 2], [0, 2, 0], [2, 0, 0], [1, 1, 0], [1, 0, 1]])
c_1 = np.array([[0, 0, 2], [0, 2, 0], [2, 0, 0]])
b_00 = np.array([2.2, 1.1, 4.4])  # np.array([2.2, 3.3, 40.4])
b_11 = np.array([1.2, 3.3])       # np.array([1.2, 5.3])

# Vectorized code for prep
b_0 = np.outer(b_00, coord)                          # (len(b_00), 3)
b_1 = np.outer(b_11, coord2)                         # (len(b_11), 3)
factor = (b_00 + b_11.reshape(-1, 1)).T[:, :, None]  # (len(b_00), len(b_11), 1)
rn = np.divide(b_0[:, None] + b_1, factor)           # (len(b_00), len(b_11), 3)
rn_1 = coord - rn
rn_2 = coord2 - rn

# THIS IS YOUR INTERESTING PART
results = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None], rn_1.transpose(2, 0, 1), rn_2.transpose(2, 0, 1), factor.transpose(2, 0, 1)).transpose(3, 4, 0, 1, 2), axis=4)
results = np.add.reduce(results, axis=(0, 1))
results
array([[6707.793061061082, 5277.838468285241, 5277.838468285241],
[5277.838468285241, 5992.8157646731615, 5277.838468285241],
[5277.838468285241, 5277.838468285241, 5992.8157646731615],
[7037.117957713655, 7513.7694886389345, 7513.7694886389345],
[7990.421019564216, 7037.117957713655, 7513.7694886389345]],
dtype=object)
Just for understanding purposes: since in the old loop solution myfunc operates along the coordinate axis of rn_1 and rn_2, the broadcasting channels need to be the last two axes instead of the first. So I add two channels at the end of c_0 and c_1, and then bring the last axis of rn_1 to the front, turning the (3, 2, 3) array into (3, 3, 2); similarly for factor. Now myfunc operates on tensors with the broadcasting channels highlighted: (5, 1, 3, 1, 1), (1, 3, 3, 1, 1), (3, 3, 2), (3, 3, 2), (1, 3, 2).
Finally, I transpose again to bring the broadcast channels to the front, so that we have a (3, 2, 5, 3, 3) shaped tensor whose first two axes are the broadcasted versions of the 3 x 2 nested loops, and the remaining (5, 3, 3) block is the matrix you now need to np.prod over axis=4 instead of axis=2.
After that, it was a simple matter of using np.add.reduce to take the sum over axes 0 and 1, bringing the result down to a (5, 3) matrix. The final result should be exactly the same as the result of the loop.
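To make the axis bookkeeping concrete, here is a shape trace of that line (reusing the names defined above):

raw = myfunc(c_0[:, None, :, None, None],  # (5, 1, 3, 1, 1)
             c_1[None, :, :, None, None],  # (1, 3, 3, 1, 1)
             rn_1.transpose(2, 0, 1),      # (3, 3, 2)
             rn_2.transpose(2, 0, 1),      # (3, 3, 2)
             factor.transpose(2, 0, 1))    # (1, 3, 2)
print(raw.shape)                           # (5, 3, 3, 3, 2)
print(raw.transpose(3, 4, 0, 1, 2).shape)  # (3, 2, 5, 3, 3)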
I also took the liberty of modifying the first part, since that was more comfortable to my eyes, but feel free to ignore that part.
PS: I checked that it works seamlessly for the example you mentioned:
b_00 = np.array([2.2, 1.1, 4.4, 5.1, 6.2])
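For instance, a quick check along those lines (a sketch that reruns the prep with the longer b_00 and confirms the output shape):

b_00 = np.array([2.2, 1.1, 4.4, 5.1, 6.2])
b_0 = np.outer(b_00, coord)
factor = (b_00 + b_11.reshape(-1, 1)).T[:, :, None]
rn = np.divide(b_0[:, None] + b_1, factor)
rn_1 = coord - rn
rn_2 = coord2 - rn
results = np.prod(myfunc(c_0[:, None, :, None, None], c_1[None, :, :, None, None],
                         rn_1.transpose(2, 0, 1), rn_2.transpose(2, 0, 1),
                         factor.transpose(2, 0, 1)).transpose(3, 4, 0, 1, 2), axis=4)
results = np.add.reduce(results, axis=(0, 1))
print(results.shape)  # (5, 3), matching the loop version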