
Compiling numpy with OpenBLAS integration

I am trying to install numpy with OpenBLAS, but I am at a loss as to how the site.cfg file needs to be written.

I followed the installation procedure and it completed without errors. However, performance degrades when the number of threads used by OpenBLAS is increased above 1 (controlled by the OMP_NUM_THREADS environment variable).

I am not sure whether the OpenBLAS integration is correct. Could anyone provide a site.cfg file that achieves this?
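
For reference, numpy can report which BLAS/LAPACK it was built against; I am not sure whether this alone is enough to confirm the integration, but it at least shows whether the [openblas] section of site.cfg was picked up:

    import numpy as np

    # Prints the build-time BLAS/LAPACK configuration; an openblas_info
    # section marked FOUND should appear if site.cfg was picked up.
    np.__config__.show()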

P.S.: OpenBLAS integration in other Python-based toolkits such as Theano gives a substantial performance boost with increasing thread counts on the same machine.

asked Jul 11 '12 by Vijay


People also ask

Is NumPy automatically installed with Python?

The only prerequisite for installing NumPy is Python itself. If you don't have Python yet and want the simplest way to get started, we recommend you use the Anaconda Distribution - it includes Python, NumPy, and many other commonly used packages for scientific computing and data science.

Does NumPy use OpenBLAS?

NumPy does not require any external linear algebra libraries to be installed. However, if these are available, NumPy's setup script can detect them and use them for building. A number of different LAPACK library setups can be used, including optimized LAPACK libraries such as OpenBLAS or MKL.


2 Answers

I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK.

This was my process:

  1. Compile OpenBLAS:

    $ git clone https://github.com/xianyi/OpenBLAS
    $ cd OpenBLAS && make FC=gfortran
    $ sudo make PREFIX=/opt/OpenBLAS install

    If you don't have admin rights you could set PREFIX= to a directory where you have write privileges (just modify the corresponding steps below accordingly).

  2. Make sure that the directory containing libopenblas.so is in your shared library search path.

    • To do this locally, you could edit your ~/.bashrc file to contain the line

      export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH 

      The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

    • Another option that will work for multiple users is to create a .conf file in /etc/ld.so.conf.d/ containing the line /opt/OpenBLAS/lib, e.g.:

      $ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/ld.so.conf.d/openblas.conf" 

    Once you are done with either option, run

    $ sudo ldconfig 
  3. Grab the numpy source code:

    $ git clone https://github.com/numpy/numpy
    $ cd numpy
  4. Copy site.cfg.example to site.cfg and edit the copy:

    $ cp site.cfg.example site.cfg
    $ nano site.cfg

    Uncomment these lines:

    ....
    [openblas]
    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include
    ....
  5. Check configuration, build, install (optionally inside a virtualenv)

    $ python setup.py config 

    The output should look something like this:

    ...
    openblas_info:
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    ...

    Installing with pip is preferable to using python setup.py install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

    $ pip install . 
  6. Optional: you can use this script to test performance for different thread counts (a rough sketch of such a script is shown after this list).

    $ OMP_NUM_THREADS=1 python build/test_numpy.py

    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807

    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c

    dot: 0.099796795845 sec

    $ OMP_NUM_THREADS=8 python build/test_numpy.py

    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807

    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c

    dot: 0.0439578056335 sec
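
A rough sketch of what such a test script might look like (this is not the linked script verbatim; the matrix size and output format are only placeholders):

    import sys
    import time
    import numpy as np

    print("version:", np.__version__)
    print("maxint: ", sys.maxsize)

    # The linked script also prints the detected BLAS configuration
    # (as in the output above); np.__config__.show() reports the same
    # information.

    # Time a large matrix product; invoke the script with different
    # OMP_NUM_THREADS values to compare thread counts.
    a = np.random.rand(3000, 3000)
    t0 = time.time()
    a.dot(a)
    print("dot: %s sec" % (time.time() - t0))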

There seems to be a noticeable improvement in performance for higher thread counts. However, I haven't tested this very systematically, and it's likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.
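
One quick way to see where that crossover happens (just a sketch along the same lines, not something I've benchmarked carefully) would be to time the same product over a range of matrix sizes and compare runs with different OMP_NUM_THREADS settings:

    import time
    import numpy as np

    # Time an n x n matrix product for a few sizes; run once with
    # OMP_NUM_THREADS=1 and once with a larger value to see at which
    # size the extra threads start to pay off.
    for n in (100, 500, 1000, 2000, 4000):
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        t0 = time.time()
        a.dot(b)
        print("n = %4d: %.4f s" % (n, time.time() - t0))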

answered Sep 23 '22 by ali_m


Just in case you are using Ubuntu or Mint, you can easily get an openblas-linked numpy by installing both numpy and OpenBLAS via apt-get:

sudo apt-get install python3-numpy libopenblas-dev

On a fresh Ubuntu Docker container, I tested the following script, copied from the blog post "Installing Numpy and OpenBLAS":

import numpy as np
import numpy.random as npr
import time

# --- Test 1
N = 1
n = 1000

A = npr.randn(n, n)
B = npr.randn(n, n)

t = time.time()
for i in range(N):
    C = np.dot(A, B)
td = time.time() - t
print("dotted two (%d,%d) matrices in %0.1f ms" % (n, n, 1e3*td/N))

# --- Test 2
N = 100
n = 4000

A = npr.randn(n)
B = npr.randn(n)

t = time.time()
for i in range(N):
    C = np.dot(A, B)
td = time.time() - t
print("dotted two (%d) vectors in %0.2f us" % (n, 1e6*td/N))

# --- Test 3
m, n = (2000, 1000)

A = npr.randn(m, n)

t = time.time()
[U, s, V] = np.linalg.svd(A, full_matrices=False)
td = time.time() - t
print("SVD of (%d,%d) matrix in %0.3f s" % (m, n, td))

# --- Test 4
n = 1500
A = npr.randn(n, n)

t = time.time()
w, v = np.linalg.eig(A)
td = time.time() - t
print("Eigendecomp of (%d,%d) matrix in %0.3f s" % (n, n, td))

Without openblas the result is:

dotted two (1000,1000) matrices in 563.8 ms
dotted two (4000) vectors in 5.16 us
SVD of (2000,1000) matrix in 6.084 s
Eigendecomp of (1500,1500) matrix in 14.605 s

After I installed OpenBLAS with apt install libopenblas-dev, I checked the numpy linkage with

import numpy as np
np.__config__.show()

and the output is:

atlas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
blas_info:
    library_dirs = ['/usr/lib']
    libraries = ['blas', 'blas']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
mkl_info:
  NOT AVAILABLE
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
lapack_opt_info:
    library_dirs = ['/usr/lib']
    libraries = ['lapack', 'lapack', 'blas', 'blas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
blas_opt_info:
    library_dirs = ['/usr/lib']
    libraries = ['blas', 'blas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
atlas_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
lapack_info:
    library_dirs = ['/usr/lib']
    libraries = ['lapack', 'lapack']
    language = f77
atlas_blas_threads_info:
  NOT AVAILABLE

It doesn't show any linkage to openblas (on Debian/Ubuntu the generic libblas.so.3 is typically provided through the alternatives mechanism, so installing libopenblas-dev swaps the BLAS implementation at the shared-library level without changing what numpy reports about its build). However, the new results of the script show that numpy must be using openblas; see also the sketch after the timings for a more direct runtime check:

dotted two (1000,1000) matrices in 15.2 ms
dotted two (4000) vectors in 2.64 us
SVD of (2000,1000) matrix in 0.469 s
Eigendecomp of (1500,1500) matrix in 2.794 s
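
For a more direct runtime check (just a Linux-only sketch, not from the blog post), you can look at which shared objects the interpreter actually has mapped after importing numpy; if OpenBLAS is doing the work, an openblas-provided library should show up:

import numpy as np  # importing numpy loads its BLAS shared library

# /proc/self/maps lists every shared object mapped into this process;
# print the ones that look like BLAS/LAPACK libraries.
with open("/proc/self/maps") as f:
    libs = {line.split()[-1] for line in f if ".so" in line}

for lib in sorted(libs):
    if "blas" in lib.lower() or "lapack" in lib.lower():
        print(lib)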
answered Sep 22 '22 by entron