I have to multiply very large 2D arrays in Python about 100 times. Each matrix has 32000x32000 elements.
I'm using np.dot(X, Y), but each multiplication takes a very long time. Here is a sample of my code:
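    import numpy as np

    # generate_large_2darray() is defined elsewhere and returns a 32000x32000 array
    X = None
    for i in range(100):
        multiplying = True
        if X is None:
            # First iteration: create the initial matrix, nothing to multiply yet
            X = generate_large_2darray()
            multiplying = False
        else:
            Y = generate_large_2darray()
        if multiplying:
            X = np.dot(X, Y)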
Is there a much faster way to do this?
Update
According to htop, my Python script is using only one core. Also, after 3h25m, only 4 multiplications have completed.

Update 2
I've tried to execute:
import numpy.distutils.system_info as info
info.get_info('atlas')
but I've received:
/home/francescof/.local/lib/python2.7/site-packages/numpy/distutils/system_info.py:564: UserWarning: Specified path /home/apy/atlas/lib is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
{}
So I think ATLAS is not configured correctly.
For blas, on the other hand, I just get {}, with no warnings or errors.
As ali_m suggested, using an optimized BLAS library can speed up these operations. However, the problem on my system was a misconfigured numpy installation. Here is the solution:
1) Make sure all required libraries are installed (you can use ATLAS, OpenBLAS, etc.). I chose ATLAS because it is directly supported in Ubuntu.
sudo apt-get install libatlas3gf-base libatlas-base-dev libatlas-dev
2) Remove any previous numpy installation, e.g., pypm uninstall numpy (if you installed it using ActivePython).
3) Reinstall numpy using pip: pip install numpy
4) Check that ATLAS is linked correctly:
import numpy.distutils.system_info as info
info.get_info('atlas')
ATLAS version 3.8.4 built by buildd on Sat Sep 10 23:12:12 UTC 2011:
UNAME : Linux crested 2.6.24-29-server #1 SMP Wed Aug 10 15:58:57 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
INSTFLG : -1 0 -a 1
ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=1993 -DATL_USE64BITS -DATL_GAS_x8664
F2CDEFS : -DAdd_ -DF77_INTEGER=int -DStringSunStyle
CACHEEDGE: 393216
F77 : gfortran, version GNU Fortran (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
F77FLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
SMC : gcc, version gcc (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
SMCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
SKC : gcc, version gcc (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
SKCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
{'libraries': ['lapack', 'f77blas', 'cblas', 'atlas'], 'library_dirs': ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base'], 'define_macros': [('ATLAS_INFO', '"\\"3.8.4\\""')], 'language': 'f77', 'include_dirs': ['/usr/include/atlas']}
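As a quick sanity check after reinstalling, one could also print NumPy's build configuration and time a smaller product. This is only a minimal sketch: the helper name time_matmul and the 1000x1000 size are illustrative assumptions, not part of the original post. np.show_config() is used here because numpy.distutils is deprecated in recent NumPy releases.

```python
import time
import numpy as np


def time_matmul(n=1000, seed=0):
    """Time a single n x n float64 product; returns (seconds, result).

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))
    start = time.perf_counter()
    c = np.dot(a, b)
    elapsed = time.perf_counter() - start
    return elapsed, c


# Print which BLAS/LAPACK libraries this NumPy build was linked against.
np.show_config()

elapsed, c = time_matmul(1000)
print("1000x1000 product took %.3f s" % elapsed)
```

If a product of this size takes many seconds, that may be a hint that NumPy is falling back to an unoptimized implementation, although the exact timing depends heavily on the machine.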
Matrix multiplication is always expensive: around O(n^3) for the standard algorithm. Performing this operation in NumPy is probably the fastest way to deal with it, short of writing your own matrix multiplier in a compiled language that is "closer to the metal" (like C), and even that would probably still be slower. I think you are doing this operation in the best way, but you must realize that a 32000x32000 matrix is very large to be performing any operations on, let alone matrix multiplication.
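To put the cubic cost into numbers, here is a rough back-of-the-envelope estimate. It assumes float64 elements (8 bytes each) and a sustained throughput of 100 GFLOP/s; both are assumptions chosen for illustration, not measurements from the question.

```python
n = 32000

# A dense n x n product needs roughly n^3 multiplications and n^3 additions.
flops = 2 * n**3

# Memory for one matrix, assuming float64 (8 bytes per element).
bytes_per_matrix = n * n * 8

# Hypothetical sustained throughput; real hardware may differ a lot.
assumed_gflops = 100.0
seconds = flops / (assumed_gflops * 1e9)

print("floating-point operations per product: %.3e" % flops)
print("memory per matrix: %.2f GB" % (bytes_per_matrix / 1e9))
print("estimated time per product: %.0f s" % seconds)
```

Under these assumptions a single product costs about 6.6e13 operations and each matrix occupies about 8.2 GB, so X, Y, and the result together need roughly 25 GB of RAM.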
That was the bad news, but here is the good news. I don't know what type of data you are working with, but matrices often have symmetries or other structure that can greatly simplify the calculation. If your data is not entirely random there may be hope, but you will have to look into the actual structure of the matrices you are working with. I suggest reading about some of the "special matrices" to see if your data might fall into one of those categories. Any material you find on the category your data falls into should also discuss or cite efficient algorithms for the expensive operations.
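As one small illustration of how structure can help, suppose (purely hypothetically) that the right-hand matrix is diagonal. The product then reduces to scaling the columns of X, which is O(n^2) instead of O(n^3). The names and the size below are made up for the sketch; whether any such structure applies depends entirely on your data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.random((n, n))
d = rng.random(n)  # diagonal entries of a hypothetical diagonal matrix Y

# Generic dense product: builds the full n x n diagonal matrix, O(n^3) work.
dense = np.dot(X, np.diag(d))

# Structured product: broadcasting scales column j of X by d[j], O(n^2) work.
fast = X * d

print(np.allclose(dense, fast))
```

Both expressions give the same result, but the second never materializes the diagonal matrix and skips all the multiplications by zero.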