Are BLAS Level 1 procedures still relevant for modern Fortran compilers?

Question

Most of the BLAS Level 1 API can be written trivially in Fortran 90 and later using whole-array assignments and intrinsic procedures.
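To make the comparison concrete, here is a minimal sketch of how some double-precision Level 1 routines map onto array syntax and intrinsics. The program name, variable names, and the choice of routines are illustrative assumptions, not part of the original question.

```fortran
program level1_equivalents
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: x(n), y(n), alpha

  x = [1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp]
  y = 1.0_dp
  alpha = 2.0_dp

  y = alpha*x + y                                  ! DAXPY: y := alpha*x + y
  print *, 'dot   =', dot_product(x, y)            ! DDOT
  ! DNRM2: this naive form has no scaling against overflow/underflow,
  ! which the reference BLAS routine does provide.
  print *, 'nrm2  =', sqrt(dot_product(x, x))
  print *, 'asum  =', sum(abs(x))                  ! DASUM
  print *, 'iamax =', maxloc(abs(x), dim=1)        ! IDAMAX (1-based index)
  x = alpha*x                                      ! DSCAL: x := alpha*x
  y = x                                            ! DCOPY: y := x
end program level1_equivalents
```

These forms only cover unit-stride, whole-array use; the BLAS increment arguments would correspond to strided array sections such as x(1:n:incx).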

Assuming you are using a modern optimizing compiler, such as Intel Fortran, with appropriate target-specific optimization options, is there any performance benefit to using BLAS Level 1 procedures instead, say from Intel MKL or another fast BLAS implementation?

If there is, at what typical vector size do these benefits appear?

tpg2114 · Accepted Answer

It depends. We've tested this before with the Intel compiler and ran into surprising results. For example, the Fortran intrinsic DOT_PRODUCT and the BLAS implementation showed different trends depending on the problem size. As the number of elements in the arrays grew, BLAS became faster than the intrinsic. But for small problem sizes, the intrinsic was much faster.
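A comparison like this can be reproduced with a small timing loop. The following is only a sketch under stated assumptions: it must be linked against a BLAS library that provides DDOT, and the size range, repetition count, and program name are arbitrary choices, not the setup used in the tests described above.

```fortran
program dot_benchmark
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  double precision, external :: ddot      ! supplied by the linked BLAS library
  real(dp), allocatable :: x(:), y(:)
  real(dp) :: s_intr, s_blas, t_intr, t_blas
  integer(i8) :: c0, c1, rate
  integer :: n, k, rep, nrep

  n = 4
  do k = 1, 10                             ! sizes 4, 16, ..., 4**10
     allocate(x(n), y(n))
     call random_number(x)
     call random_number(y)
     nrep = max(1, 20000000 / n)           ! keep total work roughly constant
     s_intr = 0.0_dp
     s_blas = 0.0_dp

     call system_clock(c0, rate)
     do rep = 1, nrep
        s_intr = s_intr + dot_product(x, y)
     end do
     call system_clock(c1)
     t_intr = real(c1 - c0, dp) / real(rate, dp) / nrep

     call system_clock(c0, rate)
     do rep = 1, nrep
        s_blas = s_blas + ddot(n, x, 1, y, 1)
     end do
     call system_clock(c1)
     t_blas = real(c1 - c0, dp) / real(rate, dp) / nrep

     ! Columns: n, seconds per call (intrinsic), seconds per call (BLAS),
     ! and the difference of the accumulated sums, printed so they are used.
     print '(i10, 2es14.4, es12.2)', n, t_intr, t_blas, abs(s_intr - s_blas)
     deallocate(x, y)
     n = n * 4
  end do
end program dot_benchmark
```

An optimizing compiler may hoist the loop-invariant intrinsic call out of the repetition loop, so treat the numbers with care and check the generated code or vary the inputs between repetitions if the intrinsic timings look implausibly small.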

For our use cases, we measured the cut-off size at which one becomes faster than the other, and we use if-statements to decide which to call. I can't share those results, but I encourage you to test it yourself. There is still a benefit to using BLAS.
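The if-statement approach might look something like the sketch below. The module and function names are hypothetical, and the cut-off value is a placeholder rather than a measured result; it has to be determined by benchmarking on your own compiler, BLAS library, and hardware.

```fortran
module dot_dispatch
  implicit none
  private
  public :: dot_auto
  integer, parameter :: dp = kind(1.0d0)
  ! Placeholder cut-off, NOT a measured value: replace it with the size
  ! at which BLAS overtakes the intrinsic in your own benchmarks.
  integer, parameter :: blas_cutoff = 1000
contains
  function dot_auto(x, y) result(d)
    real(dp), intent(in) :: x(:), y(:)     ! assumed to have equal size
    real(dp) :: d
    double precision, external :: ddot     ! supplied by the linked BLAS library

    if (size(x) >= blas_cutoff) then
       d = ddot(size(x), x, 1, y, 1)       ! large vectors: call BLAS
    else
       d = dot_product(x, y)               ! small vectors: use the intrinsic
    end if
  end function dot_auto
end module dot_dispatch
```

A caller would then write d = dot_auto(a, b) after use dot_dispatch. If non-contiguous array sections are passed in, the compiler may create temporary copies for the DDOT call, which can shift the break-even point.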

Tags:

fortran

blas

abbot
