I am trying to dive into einsum notation. This question and its answers have helped me a lot. But now I can't grasp the machinery of einsum when calculating the outer product:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
np.einsum('i,j->ij', x, y)
array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])
That answer gives the following rule:
By repeating the label i in both input arrays, we are telling einsum that these two axes should be multiplied together.
I can't understand how this multiplication happens when we haven't provided any repeated axis label in np.einsum('i,j->ij', x, y).
Could you please give the steps that np.einsum took in this example? Or, as a broader question: how does einsum work when no matching axis labels are given?
In the output of np.einsum('i,j->ij', x, y), element [i,j] is simply the product of element i in x and element j in y. In other words, np.einsum('i,j->ij', x, y)[i,j] = x[i]*y[j].
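To make that concrete, here is a minimal sketch that builds the same outer product with explicit loops, mirroring what the subscripts 'i,j->ij' describe (the loop version is for illustration only; einsum does this internally):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# out[i, j] = x[i] * y[j], exactly what 'i,j->ij' specifies:
# each output label (i and j) becomes a loop over that axis.
out = np.empty((x.size, y.size), dtype=x.dtype)
for i in range(x.size):
    for j in range(y.size):
        out[i, j] = x[i] * y[j]

print(out)
print(np.array_equal(out, np.einsum('i,j->ij', x, y)))  # True
```

So no repeated label is needed here: every distinct input label simply becomes its own output axis, and the element-wise product is taken across all index combinations.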
Compare it to np.einsum('i,i->i', x, y), where element i of the output is x[i]*y[i]:
np.einsum('i,i->i', x, y)
[ 4 10 18]
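Here the repeated label i ties the two inputs to the same loop index, so this is just the element-wise product. A quick check:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# 'i,i->i' pairs the single shared axis: out[i] = x[i] * y[i]
out = np.einsum('i,i->i', x, y)
print(out)  # [ 4 10 18]
print(np.array_equal(out, x * y))  # True
```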
And if a label in the input is missing from the output, the result is summed along the missing label's axis. Here is a simple example:
np.einsum('i,j->i', x, y)
[15 30 45]
Here the label j in the input is missing from the output, which is equivalent to summation along axis=1 (corresponding to label j):
np.sum(np.einsum('i,j->ij', x, y), axis=1)
[15 30 45]
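Putting the two rules together: a label present in the inputs but absent from the output is summed over. A sketch verifying that 'i,j->i' equals the outer product summed over axis 1 (and, since nothing couples i and j, also equals x scaled by sum(y)):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Dropping j from the output sums over that axis:
# out[i] = sum_j x[i]*y[j] = x[i] * sum(y)
a = np.einsum('i,j->i', x, y)
b = np.sum(np.einsum('i,j->ij', x, y), axis=1)

print(a)                         # [15 30 45]
print(np.array_equal(a, b))      # True
print(np.array_equal(a, x * y.sum()))  # True
```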