Greetings,
I'm not sure if this is a dumb question or not.
Let's say I have 3 numpy arrays, A1, A2, A3, and 3 floats, c1, c2, c3,
and I'd like to evaluate B = A1*c1 + A2*c2 + A3*c3.
Will numpy compute this as, for example,
E1 = A1*c1
E2 = A2*c2
E3 = A3*c3
D1 = E1+E2
B = D1+E3
or is it more clever than that? In C++ I had a neat way to abstract this kind of operation:
I defined a series of general 'LC' template functions, LC for linear combination, like:
template<class T, class D>
void LC(T &R,
        const T &L0, D C0,
        const T &L1, D C1,
        const T &L2, D C2)
{
    R = L0*C0
      + L1*C1
      + L2*C2;
}
and then specialized this for various types;
for instance, for an array the code looked like:
for (int i = 0; i < L0.length; i++)
    R.array[i] = L0.array[i]*C0
               + L1.array[i]*C1
               + L2.array[i]*C2;
thus avoiding having to create new intermediate arrays.
This may look messy but it worked really well.
I could do something similar in Python, but I'm not sure if it's necessary.
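Roughly, I mean a direct translation of that loop, something like this sketch (lc3 is just a placeholder name):

def lc3(R, L0, c0, L1, c1, L2, c2):
    # Fill R in place, element by element, just like the C++
    # specialization -- no intermediate arrays are created.
    # Note, though, that a Python-level loop like this is far
    # slower on numpy arrays than numpy's vectorized operations.
    for i in range(len(L0)):
        R[i] = L0[i]*c0 + L1[i]*c1 + L2[i]*c2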
Thanks in advance for any insight. -nick
While numpy, in theory, could at any time upgrade its internals to perform wondrous optimizations, at the present time it does not: B = A1*c1 + A2*c2 + A3*c3 will indeed produce and then discard intermediate temporary arrays ("spending" some auxiliary memory, of course -- nothing else).

B = A1*c1, followed by B += A2*c2; B += A3*c3, will, again at this time, avoid spending some of that temporary memory.
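If you want to go further and also avoid the temporaries produced by A2*c2 and A3*c3 themselves, a sketch using numpy's out= argument would look something like this (lc3 and tmp are just illustrative names; it assumes all three arrays share the same shape and a float dtype):

import numpy as np

def lc3(A1, c1, A2, c2, A3, c3, out=None, tmp=None):
    # Compute A1*c1 + A2*c2 + A3*c3 using at most two scratch
    # arrays, reused across all sub-expressions, instead of a
    # fresh temporary for every multiplication and addition.
    if out is None:
        out = np.empty_like(A1)
    if tmp is None:
        tmp = np.empty_like(A1)
    np.multiply(A1, c1, out=out)   # out = A1*c1, no new array
    np.multiply(A2, c2, out=tmp)   # tmp = A2*c2, reuses tmp
    out += tmp                     # in-place accumulation
    np.multiply(A3, c3, out=tmp)   # tmp = A3*c3, reuses tmp again
    out += tmp
    return out

So B = lc3(A1, c1, A2, c2, A3, c3) allocates only the two scratch arrays, or none at all if you pass in preallocated out and tmp.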
Of course, you'll be able to tell the difference only if you're operating in an environment with scarce real memory (where some of that auxiliary memory is just virtual and leads to page faults) and for sufficiently large arrays to "spend" all real memory and then some. Under such extreme conditions, however, a little refactoring can buy you some performance.