Like many others, I've bought myself a new Ryzen CPU. I need to use Anaconda Python for my PhD (together with TensorFlow etc.). Since Anaconda now comes pre-packaged with MKL, which is slow on AMD CPUs, what is the best way to set up an Anaconda environment with OpenBLAS and link NumPy and scikit-learn against it, while keeping all other packages the same?
I've found the following posts, which all point to installing some packages one way or another.
https://anaconda.org/anaconda/nomkl
https://anaconda.org/anaconda/openblas
How to install scipy without mkl
While MKL does work on AMD CPUs, for competitive reasons MKL checks whether the CPU on a system is made by Intel and forces the use of much slower routines otherwise.
An alternative to giving up MKL is simply to make it run much faster on a Ryzen CPU by telling MKL to use a more Ryzen-compatible instruction set. By doing
conda install mkl -c intel --no-update-deps
set MKL_DEBUG_CPU_TYPE=5
I saw about a 15x speedup using numpy/theano/PyMC3 on my Ryzen CPU under Windows 10 vs the default initial miniconda installation.
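To check whether the setting makes a difference on a given machine, a rough timing sketch like the one below can be run once with the variable unset and once with it set. The helper names, the matrix size and the repeat count are arbitrary choices for illustration, and the numbers will depend on the CPU and the MKL version installed.

```python
import os
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time (in seconds) over `repeats` calls to fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def main(n=1500):
    # NumPy is imported lazily so the script still runs (and says so) without it.
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed in this environment.")
        return
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    flag = os.environ.get("MKL_DEBUG_CPU_TYPE", "<unset>")
    seconds = best_time(lambda: a @ b)
    print(f"MKL_DEBUG_CPU_TYPE={flag}: {n}x{n} matmul, best of 3 = {seconds:.3f} s")


if __name__ == "__main__":
    main()
```

Run it from a fresh terminal each time, so the variable is either set or unset before Python starts.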
This post from reddit has a much more thorough explanation of what's going on, but it's just a one-liner in your terminal to trick MKL into thinking you are an Intel system, since MKL does nasty things to non-Intel devices: https://www.reddit.com/r/MachineLearning/comments/f2pbvz/discussion_workaround_for_mkl_on_amd/
WINDOWS:
Open a command prompt (CMD) with admin rights and type in:
setx /M MKL_DEBUG_CPU_TYPE 5
Doing this will make the change permanent and available to ALL Programs using the MKL on your system until you delete the entry again from the variables.
LINUX:
Simply type in a terminal:
export MKL_DEBUG_CPU_TYPE=5
before running your script from the same instance of the terminal.
Permanent solution for Linux:
echo 'export MKL_DEBUG_CPU_TYPE=5' >> ~/.profile
will apply the setting profile-wide.
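If changing the shell or system settings is not an option, the variable can also be set from inside the script itself. The sketch below assumes that MKL reads MKL_DEBUG_CPU_TYPE when the library is first loaded, so the assignment has to happen before NumPy is imported. The function name and its parameters are hypothetical, introduced only for this example.

```python
import os
import sys


def enable_mkl_workaround(environ=os.environ, loaded_modules=sys.modules):
    """Set MKL_DEBUG_CPU_TYPE=5 unless the user already chose a value.

    Returns False if NumPy was already imported, in which case the setting
    probably comes too late to affect this process (assumption: MKL reads
    the variable when it is loaded).
    """
    too_late = "numpy" in loaded_modules
    environ.setdefault("MKL_DEBUG_CPU_TYPE", "5")
    return not too_late


if __name__ == "__main__":
    if not enable_mkl_workaround():
        print("Warning: NumPy was imported before the variable was set.")
    try:
        import numpy as np  # imported only after the environment is prepared
    except ImportError:
        print("NumPy is not installed in this environment.")
    else:
        print("NumPy", np.__version__, "loaded with MKL_DEBUG_CPU_TYPE =",
              os.environ["MKL_DEBUG_CPU_TYPE"])
```

Using `setdefault` keeps any value that was already exported in the shell, so the script does not silently override a deliberate choice.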
Some highlights since I figure you can click the link to read the entire thing if interested:
"However, the numerical lib that comes with many of your packages by default is the Intel MKL. The MKL runs notoriously slow on AMD CPUs for some operations. This is because the Intel MKL uses a discriminative CPU Dispatcher that does not use efficient codepath according to SIMD support by the CPU, but based on the result of a vendor string query. If the CPU is from AMD, the MKL does not use SSE3-SSE4 or AVX1/2 extensions but falls back to SSE no matter whether the AMD CPU supports more efficient SIMD extensions like AVX2 or not.
The method provided here enforces AVX2 support by the MKL, independent of the vendor string result and takes less than a minute to apply. If you have an AMD CPU that is based on the Zen/Zen+/Zen2 µArch Ryzen/Threadripper, this will boost your performance tremendously."
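Since the quoted explanation hinges on the vendor string and on AVX2 support, a small check like the following can tell whether a Linux machine is one where the workaround is relevant. This is only a sketch: it assumes the usual /proc/cpuinfo layout with `vendor_id` and `flags` lines, and the function names are made up for illustration.

```python
import os


def parse_cpuinfo(text):
    """Return (vendor_id, set_of_flags) from the first CPU entry in cpuinfo text."""
    vendor, flags = None, set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return vendor, flags


def workaround_applies(text):
    """True for an AMD CPU that reports AVX2, the case described in the quote."""
    vendor, flags = parse_cpuinfo(text)
    return vendor == "AuthenticAMD" and "avx2" in flags


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as handle:
            print("Workaround relevant:", workaround_applies(handle.read()))
    else:
        print("No /proc/cpuinfo here; check the CPU model manually.")
```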
As of 2021, Intel unfortunately removed the MKL_DEBUG_CPU_TYPE variable to prevent people on AMD from using the workaround presented in the accepted answer. This means that the workaround no longer works with newer MKL versions, and AMD users have to either switch to OpenBLAS or keep using MKL.
To use the workaround with an older MKL, follow this method:
1. Create a conda environment with conda's and NumPy's MKL=2019.
2. Activate the environment.
3. Set MKL_DEBUG_CPU_TYPE = 5.
The commands for the above steps:
conda create -n my_env -c anaconda python numpy mkl=2019.* blas=*=*mkl
conda activate my_env
conda env config vars set MKL_DEBUG_CPU_TYPE=5
And that's it!
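To confirm that the environment picked up the setting, a short check like the one below can be run inside it. Note that conda asks you to reactivate the environment after `conda env config vars set` before the variable becomes visible. The function name `describe_setup` is hypothetical; `numpy.show_config()` prints the BLAS/LAPACK libraries NumPy was built against, which should mention MKL for this environment.

```python
import os


def describe_setup(environ=os.environ):
    """Summarise whether MKL_DEBUG_CPU_TYPE is visible to the current process."""
    value = environ.get("MKL_DEBUG_CPU_TYPE")
    if value == "5":
        return "MKL_DEBUG_CPU_TYPE=5 is visible to this process."
    if value is None:
        return "MKL_DEBUG_CPU_TYPE is not set; reactivate the environment and retry."
    return f"MKL_DEBUG_CPU_TYPE={value} (expected 5)."


if __name__ == "__main__":
    print(describe_setup())
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed in this environment.")
    else:
        np.show_config()  # lists the BLAS/LAPACK build information
```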