I have a piece of code whose performance I want to improve. My code is:
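# lis must be pre-allocated; indexing into an empty list raises IndexError
lis = [[0.0] * 6 for _ in range(6)]
for i in range(6):
    for j in range(6):
        for k in range(6):
            for l in range(6):
                lis[i][j] += matrix1[k][l] * (2 * matrix2[i][j][k][l] - matrix2[i][k][j][l])
print(lis)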
matrix2 is a 4-dimensional np-array, and matrix1 is a 2d-array.
I want to speed up this code by using np.tensordot(matrix1, matrix2), but then I'm lost.
You can just use a JIT compiler
Your solution isn't bad at all. The only things I have changed are the indexing and the loop ranges, which now follow the array shapes. If you have NumPy arrays and a lot of explicit looping, you can use a compiler (Numba), which is really simple to do.
import numba as nb
import numpy as np

# The function is compiled only at the first call (for a given set of argument datatypes)
@nb.njit(cache=True)  # set cache to False if copying the function to a command window
def almost_your_solution(matrix1, matrix2):
    # The result is indexed by (i, j), i.e. the first two axes of matrix2
    lis = np.zeros((matrix2.shape[0], matrix2.shape[1]), np.float64)
    for i in range(matrix2.shape[0]):
        for j in range(matrix2.shape[1]):
            for k in range(matrix2.shape[2]):
                for l in range(matrix2.shape[3]):
                    lis[i, j] += matrix1[k, l] * (2 * matrix2[i, j, k, l] - matrix2[i, k, j, l])
    return lis
Regarding code simplicity, I would prefer the einsum solution from hpaulj over the solution shown above. The tensordot solution isn't that easy to understand, in my opinion. But that's a matter of taste.
Comparing performance
The functions from hpaulj that I used for comparison:
def hpaulj_1(matrix1, matrix2):
    # matrix3[i, j, k, l] = 2 * matrix2[i, j, k, l] - matrix2[i, k, j, l]
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    return np.einsum('kl,ijkl->ij', matrix1, matrix3)

def hpaulj_2(matrix1, matrix2):
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    # Equivalent broadcasting form: (matrix1 * matrix3).sum(axis=(2, 3))
    return np.tensordot(matrix1, matrix3, [[0, 1], [2, 3]])
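To make the index bookkeeping concrete, here is a minimal sketch that checks the einsum and tensordot forms against the plain nested loop on a small random input. The function names loop_reference, einsum_version and tensordot_version are hypothetical and chosen for illustration only, and the sketch assumes all four axes of matrix2 have the same length, as in the question.

```python
import numpy as np

def loop_reference(matrix1, matrix2):
    # Plain nested loops, assuming all axes of matrix2 have the same length n
    n = matrix2.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    out[i, j] += matrix1[k, l] * (
                        2 * matrix2[i, j, k, l] - matrix2[i, k, j, l]
                    )
    return out

def einsum_version(matrix1, matrix2):
    # transpose(0, 2, 1, 3) swaps axes 1 and 2, so
    # matrix3[i, j, k, l] == 2 * matrix2[i, j, k, l] - matrix2[i, k, j, l]
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    return np.einsum('kl,ijkl->ij', matrix1, matrix3)

def tensordot_version(matrix1, matrix2):
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    # Contract axes (0, 1) of matrix1 with axes (2, 3) of matrix3
    return np.tensordot(matrix1, matrix3, axes=([0, 1], [2, 3]))

rng = np.random.default_rng(0)
small1 = rng.random((4, 4))
small2 = rng.random((4, 4, 4, 4))
reference = loop_reference(small1, small2)
print(np.allclose(reference, einsum_version(small1, small2)))
print(np.allclose(reference, tensordot_version(small1, small2)))
```

Checking against the loop on a small input like this is a cheap way to catch a wrong axis order before timing anything.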
Very small arrays give:
matrix1=np.random.rand(6,6)
matrix2=np.random.rand(6,6,6,6)
Original solution: 2.6 ms
Compiled solution: 2.1 µs
Einsum solution: 8.3 µs
Tensordot solution: 36.7 µs
Larger arrays give:
matrix1=np.random.rand(60,60)
matrix2=np.random.rand(60,60,60,60)
Original solution: 13.3 s
Compiled solution: 18.2 ms
Einsum solution: 115 ms
Tensordot solution: 180 ms
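As a rough sketch of how timings like these might be measured, the snippet below uses the standard-library timeit module on the two NumPy variants. The helper best_time is a hypothetical name introduced here, not something from the original answer, and the absolute numbers will depend on hardware and library versions.

```python
import timeit
import numpy as np

def best_time(func, *args, repeat=5, number=100):
    # Best per-call wall time in seconds over `repeat` batches of `number` calls
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number

def via_einsum(matrix1, matrix2):
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    return np.einsum('kl,ijkl->ij', matrix1, matrix3)

def via_tensordot(matrix1, matrix2):
    matrix3 = 2 * matrix2 - matrix2.transpose(0, 2, 1, 3)
    return np.tensordot(matrix1, matrix3, [[0, 1], [2, 3]])

rng = np.random.default_rng(0)
matrix1 = rng.random((6, 6))
matrix2 = rng.random((6, 6, 6, 6))

timings = {
    "Einsum solution": best_time(via_einsum, matrix1, matrix2),
    "Tensordot solution": best_time(via_tensordot, matrix1, matrix2),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs")
```

When timing a Numba function this way, it should be called once beforehand so that the one-off compilation is not included in the measurement.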
Conclusion
Compilation speeds up the original loop by about 3 orders of magnitude. In the timings above it is also roughly 4 to 17 times faster than the einsum and tensordot solutions.