I am trying to dive into the einsum notation. This question and its answers have helped me a lot.
But now I can't grasp the machinery of einsum when it calculates an outer product:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
np.einsum('i,j->ij', x, y)
array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])
That answer gives the following rule:
By repeating the label i in both input arrays, we are telling einsum that these two axes should be multiplied together.
I can't understand how this multiplication happens when we haven't provided any repeated axis label in np.einsum('i,j->ij', x, y).
Could you please give the steps that np.einsum took in this example?
Or, as a broader question: how does einsum work when no matching axis labels are given?
In the output of np.einsum('i,j->ij', x, y), element [i,j] is simply the product of element i in x and element j in y. In other words, np.einsum('i,j->ij', x, y)[i,j] = x[i]*y[j].
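Just as an illustration (this is not how einsum is implemented internally), here is a minimal loop-based sketch that reproduces the 'i,j->ij' result:

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Both labels i and j appear in the output, so nothing is summed:
# every index pair (i, j) gets its own cell holding x[i] * y[j].
out = np.empty((x.size, y.size), dtype=x.dtype)
for i in range(x.size):
    for j in range(y.size):
        out[i, j] = x[i] * y[j]

print(np.array_equal(out, np.einsum('i,j->ij', x, y)))  # True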
Compare it to np.einsum('i,i->i', x, y), where element i of the output is x[i]*y[i]:
np.einsum('i,i->i', x, y)
[ 4 10 18]
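A loop-based sketch of the same thing (again just for illustration, reusing x and y from above):

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Repeated input label i: both arrays are walked with the same index i,
# and i also appears in the output, so no summation takes place.
out = np.empty(x.size, dtype=x.dtype)
for i in range(x.size):
    out[i] = x[i] * y[i]

print(out)                                             # [ 4 10 18]
print(np.array_equal(out, np.einsum('i,i->i', x, y)))  # True (same as x * y)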
And if a label in the input is missing from the output, the output is summed along the axis of the missing label. Here is a simple example:
np.einsum('i,j->i', x, y)
[15 30 45]
Here the input label j is missing from the output, which is equivalent to summing along axis=1 (corresponding to label j):
np.sum(np.einsum('i,j->ij', x, y), axis=1)
[15 30 45]
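And a loop-based sketch of 'i,j->i' (illustrative only), showing how the missing output label j turns into a summation index:

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Label j appears in the input but not in the output,
# so the products x[i]*y[j] are summed over j.
out = np.zeros(x.size, dtype=x.dtype)
for i in range(x.size):
    for j in range(y.size):
        out[i] += x[i] * y[j]

print(out)                                             # [15 30 45]
print(np.array_equal(out, np.einsum('i,j->i', x, y)))  # True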