Tips and tricks on improving Fortran code performance [closed]

Tags: performance, fortran, hpc

As part of my Ph.D. research, I am developing numerical models of atmosphere and ocean circulation. These involve numerically solving systems of PDEs on ~10^6 grid points over ~10^4 time steps. Thus, a typical model simulation takes hours to a few days to complete when run with MPI on dozens of CPUs. Naturally, improving model efficiency as much as possible is important, while making sure the results remain bit-for-bit identical.

While I feel quite comfortable with my Fortran programming and am aware of quite a few tricks to make code more efficient, I feel there is still room to improve, and there are tricks that I am not aware of.

Currently, I make sure I use as few divisions as possible, try not to use literal constants (I was taught to do this very early on, e.g. define half=0.5 and use half instead of the literal 0.5 in actual computations), use as few transcendental functions as possible, etc.

What other performance-sensitive factors are there? At the moment, I am wondering about a few:

1) Does the order of mathematical operations matter? For example if I have:

a=1E-7 ; b=2E4 ; c=3E13
d=a*b*c

would d evaluate with different efficiency depending on the order of multiplication? Nowadays this must be compiler-specific, but is there a straight answer? I notice d getting a (slightly) different value depending on the order (precision limit), but does this affect efficiency or not?

2) Passing lots (e.g. dozens) of arrays as arguments to a subroutine versus accessing these arrays from a module within the subroutine?

3) Fortran 95 constructs (FORALL and WHERE) versus DO and IF? I know that these mattered back in the 90's when code vectorization was a big thing, but is there any difference now with modern compilers being able to vectorize explicit DO loops? (I am using PGI, Intel, and IBM compilers in my work)

4) Raising a number to an integer power versus multiplication? E.g.:

b=a**4

b=a*a*a*a

I have been taught to always use the latter where possible. Does this affect efficiency and/or precision? (probably compiler dependent as well)
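
To make cases 1, 3 and 4 concrete, here is a minimal sketch that only compares results, not timings. The program name, array size and test values are made up for illustration. Parentheses are used in case 1 because a Fortran compiler must honor them, whereas it is free to regroup an unparenthesized a*b*c.

```fortran
program question_cases
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 8            ! illustrative array size
  real(dp) :: a, b, c, d1, d2
  real(dp) :: x(n), y_where(n), y_loop(n)
  integer :: i

  ! Case 1: same product, two explicit groupings. Floating-point
  ! multiplication is not associative, so the last bits may differ.
  a = 1.0e-7_dp ; b = 2.0e4_dp ; c = 3.0e13_dp
  d1 = (a*b)*c
  d2 = a*(b*c)
  print *, 'grouping difference: ', d1 - d2

  ! Case 4: integer power versus explicit multiplication.
  print *, 'power difference:    ', a**4 - a*a*a*a

  ! Case 3: WHERE versus an explicit DO/IF loop doing the same masked update.
  x = [(real(i, dp) - 4.5_dp, i = 1, n)]
  y_where = 0.0_dp
  where (x > 0.0_dp) y_where = sqrt(x)

  y_loop = 0.0_dp
  do i = 1, n
    if (x(i) > 0.0_dp) y_loop(i) = sqrt(x(i))
  end do
  print *, 'WHERE and DO agree:  ', all(y_where == y_loop)
end program question_cases
```

Whether the printed differences are zero depends on the compiler and its optimization flags, which is exactly what I am unsure about.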

Please discuss and/or add any tricks and tips that you know about improving Fortran code efficiency. What else is out there? If you know anything specific to what each of the compilers above do related to this question, please include that as well.

Added: Note that I do not have any bottlenecks or performance issues per se. I am asking whether there are any general rules for optimizing code at the level of individual operations.

Thanks!


asked Oct 15 '11 18:10

milancurcic

2 Answers

You've got a priori ideas about what to do, and some of them might actually help, but the biggest payoff is in a posteriori analysis.
(Added: In other words, putting a*b*c into a different order might save a couple of cycles (which I doubt), while at the same time you have no way of knowing whether you're being blindsided by something spending 1000 cycles for no good reason.)

No matter how carefully you code it, there will be opportunities for speedup that you didn't foresee. Here's how I find them. (Some people consider this method controversial).

It's best to start with optimization flags OFF when you do this, so the code isn't all scrambled. Later you can turn them on and let the compiler do its thing.

Get it running under a debugger with enough of a workload that it runs for a reasonable length of time. While it's running, manually interrupt it, and take a good hard look at what it's doing and why. Do this several times, like 10, so you don't draw erroneous conclusions about where it's spending time.

Here are examples of things you might find:

It could be spending a large fraction of time calling math library functions unnecessarily due to the way some expressions were coded, or with the same argument values as in prior calls.
It could be spending a large fraction of time doing some file I/O, or opening/closing a file, deep inside some routine that seemed harmless to call.
It could be in a general-purpose library function, calling a subordinate subroutine, for the purpose of checking argument flags to the upper function. In such a case, much of that time might be eliminated by writing a special-purpose function and calling that instead.
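
As an illustration of the first kind of finding, here is a minimal sketch, assuming a made-up loop that scales a field by a factor that does not change between iterations. The names (hoist_example, field, theta, factor) are hypothetical and not taken from any real model.

```fortran
program hoist_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000         ! illustrative problem size
  real(dp) :: field(n), scaled_slow(n), scaled_fast(n)
  real(dp) :: theta, factor
  integer :: i

  theta = 0.3_dp
  call random_number(field)

  ! Before: exp() and cos() are called n times with the same argument.
  do i = 1, n
    scaled_slow(i) = field(i) * exp(-theta) * cos(theta)
  end do

  ! After: the loop-invariant factor is computed once, outside the loop.
  factor = exp(-theta) * cos(theta)
  do i = 1, n
    scaled_fast(i) = field(i) * factor
  end do

  ! The two versions group the multiplications differently, so the
  ! results may differ in the last bits.
  print *, 'max abs difference: ', maxval(abs(scaled_slow - scaled_fast))
end program hoist_example
```

An optimizing compiler will often hoist a case this simple by itself. The cases that sampling tends to expose are the ones hidden behind subroutine calls, where the compiler cannot see that the arguments repeat. Note also that regrouping the product can change the last bits, which matters if results must stay bit-for-bit identical.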

If you do this entire operation two or three times, you will have removed the stupid stuff that finds its way into any software when it's first written. After that, you can turn on the optimization, parallelism, or whatever, and be confident no time is being spent on silly stuff.


answered Sep 22 '22 14:09

Mike Dunlavey

I second the advice that the tricks you have been taught are silly in this era. Compilers do this for you now; such micro-optimizations are unlikely to make a significant difference and may not be portable. Write clear and understandable code. Carefully select your algorithm. One thing that can make a difference is using the indices of multidimensional arrays in the correct order: Fortran stores arrays in column-major order, so the first index should vary fastest in the innermost loop, and recasting an M x N array as N x M can help depending on your program's data access pattern. After this, if your program is too slow, measure where the CPU time is consumed and improve only those parts. Experience shows that guessing is frequently wrong and leads to writing more opaque code for no reason. If you make a code section in which your program spends 1% of its time twice as fast, the total run time drops by only 0.5%, so it won't make any noticeable difference.
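
The index-order point can be checked with a minimal sketch like the following, assuming a made-up 2000 x 2000 array; the program and variable names are for illustration only. It sums the same array twice, once with the first index innermost (contiguous in memory) and once with the second index innermost (strided).

```fortran
program loop_order
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: m = 2000, n = 2000   ! illustrative sizes
  real(dp), allocatable :: a(:,:)
  real(dp) :: total
  integer :: i, j
  integer(i8) :: t0, t1, t2, rate

  allocate(a(m, n))
  call random_number(a)

  call system_clock(t0, rate)

  ! First index innermost: consecutive iterations touch adjacent memory.
  total = 0.0_dp
  do j = 1, n
    do i = 1, m
      total = total + a(i, j)
    end do
  end do
  call system_clock(t1)
  print *, 'first index innermost:  ', real(t1 - t0, dp) / real(rate, dp), ' s, sum = ', total

  ! Second index innermost: consecutive iterations are m elements apart.
  total = 0.0_dp
  do i = 1, m
    do j = 1, n
      total = total + a(i, j)
    end do
  end do
  call system_clock(t2)
  print *, 'second index innermost: ', real(t2 - t1, dp) / real(rate, dp), ' s, sum = ', total
end program loop_order
```

The sums are printed so the compiler cannot discard the loops. At higher optimization levels some compilers interchange the loops themselves, so the timing gap may shrink or vanish; the two sums can also differ slightly because the additions happen in a different order.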

Here are previous answers on FORALL and WHERE: How can I ensure that my Fortran FORALL construct is being parallelized? and Do Fortran 95 constructs such as WHERE, FORALL and SPREAD generally result in faster parallel code?

answered Sep 21 '22 14:09

M. S. B.
