I am writing some code right now and have a placeholder using matmul that seems to be working pretty well, but I'd like to use a BLAS dgemm implementation instead. I am only using gfortran right now and getting very good speeds with matmul, but I wonder if I can do better. The current call is:

    C = transpose(matmul(transpose(A), B))

where A, B, and C are non-square, double-precision matrices. I could easily write a wrapper for dgemm against the BLAS/LAPACK libraries I already use with gfortran, but I like that I can do this all as a function (rather than worrying about a call to a subroutine and having to deal with the transposes).
I am wondering: if I compile with ifort and link against MKL, will this matmul magically change to an MKL dgemm call for me, with no wrapper?
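For reference, here is a minimal sketch of the wrapper I'd otherwise write (assuming the standard reference-BLAS dgemm interface, and hypothetical names/dimensions). Note that transpose(matmul(transpose(A), B)) is mathematically the same as matmul(transpose(B), A), so a single dgemm call with its 'T' flag covers it with no explicit transposes:

```
! Sketch: C = transpose(matmul(transpose(A), B)) equals C = B**T * A,
! so one dgemm call with TRANSA = 'T' does it directly.
! Assumes A is p x q, B is p x r, C is r x q (standard BLAS dgemm).
subroutine bt_times_a(A, B, C, p, q, r)
  implicit none
  integer, intent(in) :: p, q, r
  double precision, intent(in)  :: A(p, q), B(p, r)
  double precision, intent(out) :: C(r, q)
  external :: dgemm
  ! C := 1.0 * B**T * A + 0.0 * C
  call dgemm('T', 'N', r, q, p, 1.0d0, B, p, A, p, 0.0d0, C, r)
end subroutine bt_times_a
```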
You don't want every MATMUL to become a dgemm call; it is not profitable for very small matrices.
gfortran does what you want via the -fexternal-blas option:

    -fexternal-blas
    This option will make gfortran generate calls to BLAS functions for some matrix operations like MATMUL, instead of using our own algorithms, if the size of the matrices involved is larger than a given limit (see -fblas-matmul-limit). This may be profitable if an optimized vendor BLAS library is available. The BLAS library will have to be specified at link time.

and you can even change the size limit for switching to BLAS with -fblas-matmul-limit=n.
You can easily use MKL this way in gfortran.
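For example, a gfortran + MKL build might look like the following (the library names and the limit of 32 are assumptions that depend on your MKL installation; Intel's link-line advisor gives the exact line for your setup):

```shell
# Route MATMULs above the size limit to an external BLAS (here MKL).
# -lmkl_gf_lp64 -lmkl_sequential -lmkl_core is a typical sequential
# LP64 link line for gfortran, but verify it for your MKL version.
gfortran -O2 -fexternal-blas -fblas-matmul-limit=32 mycode.f90 \
    -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl \
    -o mycode
```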
Intel Fortran has something similar:

    /Qopt-matmul[-] (Windows), -[no-]opt-matmul (Linux)
    This option enables [disables] a compiler-generated matrix multiply (matmul) library call by identifying matrix multiplication loop nests, if any, and replacing them with a matmul library call for improved performance. This option is enabled by default if options /O3 (-O3) and /Qparallel (-parallel) are specified. This option has no effect unless option /O2 (-O2) or higher is set.
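On Linux that might look like the line below (a sketch; the exact flag spelling varies by compiler version, with newer releases using -qopt-matmul rather than -opt-matmul):

```shell
# Ask ifort to replace matmul loop nests with library calls.
# This is on by default at -O3 -parallel; spelling of the flag
# depends on the compiler version.
ifort -O3 -qopt-matmul mycode.f90 -o mycode
```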