When optimising slow parts of my code, I was surprised by the fact that A.sum() is almost twice as fast as A.max():
In [1]: A = arange(10*20*30*40).reshape(10, 20, 30, 40)
In [2]: %timeit A.max()
1000 loops, best of 3: 216 us per loop
In [3]: %timeit A.sum()
10000 loops, best of 3: 119 us per loop
In [4]: %timeit A.any()
1000 loops, best of 3: 217 us per loop
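For anyone who wants to reproduce the measurement outside IPython, here is a minimal sketch using the standard timeit module. It assumes NumPy is imported as np; the helper name best_time_us and the repeat/number values are my own choices for illustration, and the absolute numbers will vary with machine and NumPy version.

```python
import timeit

import numpy as np

# Same array as in the session above.
A = np.arange(10 * 20 * 30 * 40).reshape(10, 20, 30, 40)


def best_time_us(func, repeat=5, number=200):
    """Return the best observed per-call time of func, in microseconds."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1e6


# Time each reduction by calling the bound method with no arguments.
results = {name: best_time_us(getattr(A, name)) for name in ("max", "sum", "any")}

for name, t in results.items():
    print(f"A.{name}(): {t:.1f} us per call")
```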
I had expected that A.any() would be much faster (it should need to check only one element!), followed by A.max(), and that A.sum() would be the slowest: sum() needs to add numbers and update a value every time, whereas max needs to compare numbers every time and update only sometimes, and I thought adding should be slower than comparing. In fact, it's the opposite. Why?
max has to store a value and continuously check for potential updates, and the CPU needs to perform branch operations to effect these. sum just churns through the values. So sum will be quicker.
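To make the difference concrete, here is a conceptual sketch of the two loops in pure Python. This is not NumPy's actual implementation (that lives in compiled code), and the function names running_sum and running_max are made up for illustration. It only shows the shape of the work: the sum loop updates unconditionally, while the max loop compares first and updates only when the comparison succeeds.

```python
def running_sum(values):
    """Add every value to the accumulator; no decision is made per element."""
    total = 0
    for v in values:
        total += v  # unconditional update
    return total


def running_max(values):
    """Track the largest value seen so far; each element needs a comparison."""
    it = iter(values)
    best = next(it)  # assumes a non-empty input
    for v in it:
        if v > best:  # compare, then conditionally update (a branch)
            best = v
    return best


data = [3, 7, 1, 9, 4]
print(running_sum(data))  # 24
print(running_max(data))  # 9
```

Timing these two Python functions will not reproduce the effect described above, since interpreter overhead dominates; the sketch is only meant to show where the conditional update sits.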