Black voodoo of NumPy Einsum

Tags:

python

numpy

I have some working code that uses the einsum function, but einsum is still black voodoo to me. I would like to understand what this code is actually doing and whether it can be optimized using np.dot.

My data looks like this:

import numpy as np

n, p, q = 40000, 8, 4
a = np.random.rand(n, p, q)
b = np.random.rand(n, p)

My existing einsum calls look like this:

f1 = np.einsum("ijx,ijy->ixy", a, a)
f2 = np.einsum("ijx,ij->ix", a, b)

But what does it really do? This is what I understand so far: each dimension (axis) is represented by a label. i stands for the first axis (n), j for the second axis (p), and x and y are two different labels for the same axis (q). The output subscripts of f1 are ixy, so the output shape is (40000, 4, 4), i.e. (n, q, q).

But that's as far as I get. And

asked Nov 26 '14 09:11

Mattijn

1 Answer

Let's play around with a couple of small arrays:

In [110]: a=np.arange(2*3*4).reshape(2,3,4)

In [111]: b=np.arange(2*3).reshape(2,3)

In [112]: np.einsum('ijx,ij->ix',a,b)
Out[112]: 
array([[ 20,  23,  26,  29],
       [200, 212, 224, 236]])

In [113]: np.diagonal(np.dot(b,a)).T
Out[113]: 
array([[ 20,  23,  26,  29],
       [200, 212, 224, 236]])

np.dot sums over the last dimension of the 1st array and the 2nd-to-last dimension of the 2nd. So I have to switch the arguments so that the dimensions of size 3 line up. dot(b,a) produces a (2,2,4) array. diagonal selects 2 of those 'rows', and the transpose cleans up the axis order. Another einsum expresses that cleanup nicely:

In [122]: np.einsum('iik->ik',np.dot(b,a))
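
To make the subscripts concrete, here is a minimal sketch that spells out 'ijx,ij->ix' as explicit loops and compares it with the two dot-based cleanups above. The helper name einsum_ijx_ij_ix is made up for illustration and is not a NumPy function.

```python
import numpy as np

a = np.arange(2*3*4).reshape(2, 3, 4)
b = np.arange(2*3).reshape(2, 3)

def einsum_ijx_ij_ix(a, b):
    # Hypothetical helper: loop version of np.einsum('ijx,ij->ix', a, b).
    n, p, q = a.shape
    out = np.zeros((n, q), dtype=np.result_type(a, b))
    for i in range(n):          # i appears in the output: kept
        for x in range(q):      # x appears in the output: kept
            for j in range(p):  # j is missing from the output: summed over
                out[i, x] += a[i, j, x] * b[i, j]
    return out

loops = einsum_ijx_ij_ix(a, b)
full = np.dot(b, a)             # full[i, k, x] = sum_j b[i, j] * a[k, j, x]

print(full.shape)               # (2, 2, 4)
print(np.array_equal(loops, np.einsum('ijx,ij->ix', a, b)))
print(np.array_equal(loops, np.diagonal(full).T))
print(np.array_equal(loops, np.einsum('iik->ik', full)))
```

The dot result holds every pairing of a row i of b with a block k of a; the einsum result is only the i == k part, which is why a diagonal is needed.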

Since np.dot produces a larger intermediate array than the original einsum, it is unlikely to be faster, even if the underlying C code is tighter.
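
A rough sketch of how much larger that intermediate gets. It assumes a much smaller n than the question's 40000 so that it runs comfortably; with n = 40000 the (n, n, q) float64 array would need about 51 GB.

```python
import numpy as np

n, p, q = 500, 8, 4   # assumed smaller n than in the question, to keep memory modest
a = np.random.rand(n, p, q)
b = np.random.rand(n, p)

direct = np.einsum('ijx,ij->ix', a, b)   # shape (n, q)
full = np.dot(b, a)                      # shape (n, n, q)

print(direct.shape, direct.nbytes)
print(full.shape, full.nbytes)
print(full.nbytes // direct.nbytes)      # the intermediate is n times larger
print(np.allclose(direct, np.einsum('iik->ik', full)))
```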

(Curiously, I'm having trouble replicating np.dot(b,a) with einsum; I can't get it to generate that (2,2,...) array.)
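
For what it's worth, the full (2,2,4) array does seem reachable with einsum if the first axes of the two arrays get different labels, so that nothing forces them to pair up. A sketch with the small arrays from above; the extra label k is my own choice:

```python
import numpy as np

a = np.arange(2*3*4).reshape(2, 3, 4)
b = np.arange(2*3).reshape(2, 3)

# b is labelled 'ij' and a is labelled 'kjx': only j is shared and summed,
# while i and k stay independent, giving an (i, k, x) result.
full = np.einsum('ij,kjx->ikx', b, a)

print(full.shape)                          # (2, 2, 4)
print(np.array_equal(full, np.dot(b, a)))  # True
```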

For the a,a case we have to do something similar: roll the axes of one array so that its last dimension lines up with the 2nd-to-last of the other, do the dot, and then clean up with diagonal and transpose (or with another einsum):

In [157]: np.einsum('ijx,ijy->ixy',a,a).shape
Out[157]: (2, 4, 4)
In [158]: np.einsum('ijjx->jix',np.dot(np.rollaxis(a,2),a))
In [176]: np.diagonal(np.dot(np.rollaxis(a,2),a),0,2).T
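
Since the session above doesn't show the outputs of the last two lines, here is a small check that both cleanups agree with the original einsum. It only restates the calls above with intermediate names (big, via_einsum, via_diag) that I picked for readability.

```python
import numpy as np

a = np.arange(2*3*4).reshape(2, 3, 4)

ref = np.einsum('ijx,ijy->ixy', a, a)

# big[x, i, k, y] = sum_j a[i, j, x] * a[k, j, y]
big = np.dot(np.rollaxis(a, 2), a)
print(big.shape)                      # (4, 2, 2, 4)

via_einsum = np.einsum('ijjx->jix', big)
via_diag = np.diagonal(big, 0, 2).T   # offset=0, axis1=2, axis2=1 (default)

print(np.array_equal(ref, via_einsum))
print(np.array_equal(ref, via_diag))
```

Note that the final .T leaves the two q axes in swapped order; that is harmless here only because the a,a result is symmetric in x and y.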

tensordot is another way of taking a dot over selected axes.

np.tensordot(a,a,(1,1))
np.diagonal(np.rollaxis(np.tensordot(a,a,(1,1)),1),0,2).T  # with cleanup
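
Not part of the approaches above, but one possible alternative: np.matmul (the @ operator) treats the leading axis as a batch, so it avoids the oversized intermediate that dot and tensordot build. A sketch, assuming a NumPy version that provides matmul (1.10 or later); whether it beats einsum for these shapes would need timing on your own data.

```python
import numpy as np

n, p, q = 1000, 8, 4   # smaller n than in the question, just for a quick check
a = np.random.rand(n, p, q)
b = np.random.rand(n, p)

f1 = np.einsum('ijx,ijy->ixy', a, a)
f2 = np.einsum('ijx,ij->ix', a, b)

# For each i: (q, p) @ (p, q) -> (q, q)
m1 = a.transpose(0, 2, 1) @ a
# For each i: (1, p) @ (p, q) -> (1, q); drop the helper axis afterwards
m2 = (b[:, None, :] @ a)[:, 0, :]

print(m1.shape, np.allclose(f1, m1))
print(m2.shape, np.allclose(f2, m2))
```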

answered Oct 13 '22 06:10

hpaulj
