Fortran vs C: Mandelbrot benchmark

I stumbled across the Benchmark Game (code page) and compared Fortran and C. I was very surprised by the difference in calculation time on the Mandelbrot test (Fortran is 4.3 times slower!), because both languages have very similar feature sets. Moreover, Fortran should allow more aggressive optimization (see e.g. "Is Fortran easier to optimize than C for heavy calculations?").

Can someone explain which feature Fortran is missing that it would need to reach the speed of the C example? (It seems that the bit operations are what speed up the C code.)

EDIT: This is not a question about which programming language is better (many aspects always play a role). It is rather a fundamental question about the difference in optimization in this example.


Add-on to the answer by Peter Cordes: There is a paper, Basics of Vectorization for Fortran Applications, which also briefly discusses SIMD in Fortran programming. For Intel compilers, see Explicit Vector Programming in Fortran.

pawel_winzig asked Jan 20 '19 05:01


1 Answer

The winning C++ version on that benchmark site is manually vectorized for x86 using SIMD intrinsics (SSE, AVX, or AVX512). For example, it uses _mm256_movemask_pd(v1 <= v2); to get a bitmask of a whole vector of compare results, which lets it check 4 pixels in parallel for going out of bounds. It also uses GNU C native vector syntax, so expressions like r2 + i2 add or multiply SIMD vectors with normal C / C++ operators.
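To make the bitmask idea concrete, here is a minimal portable sketch of what a compare followed by a movemask produces. It is not the benchmark's code and uses no intrinsics: `Vec4`, `lanes_le_mask`, and `all_escaped` are hypothetical names chosen for illustration, and a plain loop stands in for the single SIMD compare and the single movemask instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane "vector" of doubles, standing in for one AVX register.
constexpr std::size_t kWidth = 4;
using Vec4 = std::array<double, kWidth>;

// Scalar emulation of "compare, then movemask": bit i of the result is set
// when a[i] <= b[i]. With AVX this would be one compare and one movemask.
unsigned lanes_le_mask(const Vec4& a, const Vec4& b) {
    unsigned mask = 0;
    for (std::size_t i = 0; i < kWidth; ++i) {
        if (a[i] <= b[i]) {
            mask |= 1u << i;
        }
    }
    return mask;
}

// True when no lane is still within bounds, i.e. all four pixels have
// squared magnitude above the limit. One integer test covers four pixels.
bool all_escaped(const Vec4& norm2, double limit = 4.0) {
    Vec4 lim;
    lim.fill(limit);
    return lanes_le_mask(norm2, lim) == 0;
}
```

For squared magnitudes {1, 5, 2, 9} compared against 4.0, lanes 0 and 2 are still in bounds, so the mask is 0b0101 and `all_escaped` is false.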

The C++ version has a loop condition that's optimized for SIMD:

 // Do 50 iterations of mandelbrot calculation for a vector of eight
 // complex values.  Check occasionally to see if the iterated results
 // have wandered beyond the point of no return (> 4.0).

The Fortran version merely uses OpenMP for parallelization, and compiler auto-vectorization isn't going to create anything nearly as good as a hand-tuned loop condition that keeps doing redundant work the source didn't specify (because that's cheaper than checking more frequently).
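The "check occasionally" loop shape can be sketched in portable C++ as follows. This is an illustration under assumptions, not the benchmark entry: `Vec8`, `mandel8`, and the `check_every` parameter are hypothetical, and the lane loops are written as plain scalar code that a compiler may or may not vectorize. Escaped lanes keep iterating (and may overflow to infinity or NaN) until the next check; that redundant work is the point of the technique. The sketch relies on IEEE comparison semantics, so it should not be built with options like -ffast-math.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 8;
using Vec8 = std::array<double, kLanes>;

// Iterates z = z*z + c for eight points at once. Returns a bitmask in which
// bit i is set if point i was still bounded (|z|^2 <= 4) at every check.
// The escape test runs only once per block of check_every iterations.
unsigned mandel8(const Vec8& cr, const Vec8& ci,
                 int max_iter = 50, int check_every = 5) {
    assert(check_every > 0 && max_iter % check_every == 0);
    Vec8 zr{};  // real parts, start at 0
    Vec8 zi{};  // imaginary parts, start at 0
    unsigned inside = (1u << kLanes) - 1;  // all lanes start as "inside"

    for (int done = 0; done < max_iter && inside != 0; done += check_every) {
        // Branch-free block: every lane is updated, escaped or not.
        for (int k = 0; k < check_every; ++k) {
            for (std::size_t i = 0; i < kLanes; ++i) {
                const double r2 = zr[i] * zr[i];
                const double i2 = zi[i] * zi[i];
                zi[i] = 2.0 * zr[i] * zi[i] + ci[i];
                zr[i] = r2 - i2 + cr[i];
            }
        }
        // Occasional check: NaN or infinity compares false, so lanes that
        // overflowed during the block count as escaped.
        unsigned still = 0;
        for (std::size_t i = 0; i < kLanes; ++i) {
            if (zr[i] * zr[i] + zi[i] * zi[i] <= 4.0) {
                still |= 1u << i;
            }
        }
        inside &= still;
    }
    return inside;
}
```

Calling `mandel8` with c = 0 in one lane and c = 3 in another should leave the first lane's bit set and clear the second; the outer loop stops early once every lane has escaped.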


There are lots of C and C++ versions of the program that run at a similar speed to the Fortran version. The languages are pretty even for C/C++ source that isn't manually vectorized.

I'm not sure if Intel Fortran or any other compiler supports extensions for manual vectorization.

Peter Cordes answered Oct 07 '22 22:10