Can someone please explain how broadcasting (ellipsis) works in the numpy.einsum() function?
Some examples to show how and when it can be used would be greatly appreciated.
I've checked the following official documentation page but there are only 2 examples and I can't seem to understand how to interpret it and use it.
http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.einsum.html
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.
numpy.einsum evaluates the Einstein summation convention on the operands. Using the Einstein summation convention, many common multi-dimensional, linear algebraic array operations can be represented in a simple fashion.
In one reported comparison, einsum was about twice as fast as NumPy's built-in functions and about six times faster than explicit loops; such timings depend on the specific operation and array sizes.
To use numpy.einsum(), pass the so-called subscripts string as the first argument, followed by your input arrays. Say you have two 2D arrays, A and B, and you want to multiply them as matrices: the subscripts string for that is 'ij,jk->ik'.
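A minimal sketch of that matrix-multiplication case follows. The shapes, the seed and the random values are arbitrary choices made for illustration; they are not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility
A = rng.random((3, 4))
B = rng.random((4, 5))

# 'ij,jk->ik': the index j appears in both inputs but not in the output,
# so einsum multiplies along j and sums over it.
C = np.einsum('ij,jk->ik', A, B)

# Same result as the matrix-multiplication operator.
assert np.allclose(C, A @ B)
```

Any index letter that is missing from the right-hand side of `->` is summed over; letters that are kept become axes of the result, in the order written.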
The ellipses are a shorthand roughly standing for "all the remaining axes not explicitly mentioned". For example, suppose you had an array of shape (2,3,4,5,6,6):
import numpy as np
arr = np.random.random((2,3,4,5,6,6))
and you wish to take the diagonal along its last two axes (for the trace, which also sums over m, drop m from the output: 'ijklmm->ijkl'):
result = np.einsum('ijklmm->ijklm', arr)
result.shape
# (2, 3, 4, 5, 6)
An equivalent way to do that would be
result2 = np.einsum('...mm->...m', arr)
assert np.allclose(result, result2)
The ellipsis provides a shorthand notation meaning (in this case) "and all the axes to the left". Here the ... stands for ijkl.
One nice thing about not having to be explicit is that
np.einsum('...mm->...m', arr)
works equally well with arrays of any number of dimensions >= 2 (so long as the last two have equal length), whereas
np.einsum('ijklmm->ijklm', arr)
only works when arr has exactly 6 dimensions.
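To see that dimension independence in practice, here is a small sketch that runs the same subscripts string on arrays with 2, 3 and 6 dimensions and checks the result against np.diagonal and np.trace. The helper names `last_two_diagonal` and `last_two_trace` are made up for this example.

```python
import numpy as np

def last_two_diagonal(a):
    # Hypothetical helper: keeps m in the output, so no summation happens;
    # this extracts the diagonal of the last two axes.
    return np.einsum('...mm->...m', a)

def last_two_trace(a):
    # Hypothetical helper: m is dropped from the output, so the diagonal
    # is summed, which gives the trace of the last two axes.
    return np.einsum('...mm->...', a)

rng = np.random.default_rng(0)   # arbitrary seed
for shape in [(6, 6), (3, 6, 6), (2, 3, 4, 5, 6, 6)]:
    a = rng.random(shape)
    d = last_two_diagonal(a)
    t = last_two_trace(a)
    assert d.shape == shape[:-1]
    assert t.shape == shape[:-2]
    assert np.allclose(d, np.diagonal(a, axis1=-2, axis2=-1))
    assert np.allclose(t, np.trace(a, axis1=-2, axis2=-1))
```

Note that '...mm->...m' returns the diagonal, not the trace: the result keeps one axis of length 6. Dropping m from the output is what turns it into a sum.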
When the ellipsis appears in the middle, it is shorthand for "all the middle axes not explicitly mentioned". For example, below, np.einsum('ijklmi->ijklm', arr) is equivalent to np.einsum('i...i->i...', arr). Here the ... stands for jklm:
arr = np.random.random((6,2,3,4,5,6))
result = np.einsum('ijklmi->ijklm', arr)
result2 = np.einsum('i...i->i...', arr)
assert np.allclose(result, result2)
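The examples above use a single operand, so the ellipsis only saves typing. The broadcasting part of the question shows up with two or more operands: each ellipsis covers that operand's own leading axes, and those axes are then broadcast against each other following the usual NumPy rules. A sketch with stacked matrix multiplication, where the shapes are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = rng.random((2, 3, 4, 5))     # a (2, 3) stack of 4x5 matrices
B = rng.random((5, 6))           # a single 5x6 matrix, no leading axes
C = rng.random((3, 5, 6))        # a (3,) stack of 5x6 matrices

# For A the ellipsis stands for two axes, for B it stands for none,
# and for C it stands for one. The leading axes are aligned from the
# right and broadcast, exactly as in ordinary array arithmetic.
AB = np.einsum('...ij,...jk->...ik', A, B)
AC = np.einsum('...ij,...jk->...ik', A, C)

print(AB.shape, AC.shape)

# np.matmul applies the same broadcasting to the leading axes.
assert np.allclose(AB, np.matmul(A, B))
assert np.allclose(AC, np.matmul(A, C))
```

Without the ellipsis you would need a separate subscripts string for every combination of input ranks ('abij,jk->abik', 'abij,bjk->abik', and so on).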