Are there any preprocessor directives that control loop unrolling?

2 Answers

For MSVC there is only a vector independence hint: http://msdn.microsoft.com/en-us/library/hh923901.aspx

#pragma loop( ivdep )

For many other compilers, such as Intel's and IBM's, there are several pragma hints for optimizing a loop:

#pragma unroll
#pragma loop count N
#pragma ivdep
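As a rough illustration of where such a hint goes, the sketch below places an unroll pragma directly in front of a loop. The function name `sum_array`, the factor 4, and the compiler-detection guard are assumptions made for this example and are not from the answer. The pragma is only a request, so the compiler is free to ignore it.

```cpp
#include <cstddef>

// Hypothetical example function: sums n floats.
float sum_array(const float* data, std::size_t n) {
    float total = 0.0f;
    // Intel and Clang both accept "#pragma unroll(N)". On other compilers
    // the guard removes the line, so no unknown-pragma warning is emitted.
#if defined(__INTEL_COMPILER) || defined(__clang__)
#pragma unroll(4)
#endif
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

The result is the same with or without the pragma; only the generated code may differ, which is best checked in the compiler's assembly output or optimization report.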

There is a thread with MSVC++ people about unroll heuristic: http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/d0b225c2-f5b0-4bb9-ac6a-4d4f61f7cb17/

VC tries to balance execution speed and code size. You can change the balance by using flags /O1 or /O2, but even when optimizing for speed VC tries to conserve code size as well.

Basically, unrolling increases code size, so it may be limited in the Os and O1 modes.

PS: A pragma looks like a preprocessor directive, but it is not. It is a directive for the compiler, and the preprocessor ignores it (passes it through unchanged).
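Because the preprocessor only passes pragmas through, a `#pragma` line cannot be produced by a macro. The standard `_Pragma` operator (C99 and C++11) can. The sketch below uses it to hide compiler-specific spellings behind one macro. The names `DO_PRAGMA`, `LOOP_UNROLL` and `scale` are hypothetical, and the GCC and Clang spellings are additions that the answer above does not mention.

```cpp
#include <cstddef>

// Stringizes its argument so that a macro parameter can appear in the pragma.
#define DO_PRAGMA(x) _Pragma(#x)

// Hypothetical wrapper: pick the unroll spelling for the current compiler.
#if defined(__clang__)
#  define LOOP_UNROLL(n) DO_PRAGMA(clang loop unroll_count(n))
#elif defined(__INTEL_COMPILER)
#  define LOOP_UNROLL(n) DO_PRAGMA(unroll(n))
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#  define LOOP_UNROLL(n) DO_PRAGMA(GCC unroll n)
#else
#  define LOOP_UNROLL(n)  // no known unroll pragma: expand to nothing
#endif

// Hypothetical example function: multiplies every element by factor.
void scale(float* data, std::size_t count, float factor) {
    LOOP_UNROLL(4)
    for (std::size_t i = 0; i < count; ++i) {
        data[i] *= factor;
    }
}
```

On a compiler that matches none of the branches the macro expands to nothing, so the loop still compiles and behaves the same.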


answered Oct 21 '22 01:10

osgx

In the case of Intel Compiler:

#pragma loop count N helps the compiler choose the best strategy for vectorizing the loop. It saves time, so we can say it helps drive loop unrolling. Examples:

#pragma loop_count min(n),max(n),avg(n)

#pragma unroll (n) works only when used with the -O3 flag. You can use the following strategy to unroll your loop according to the target processor.

Despite the extra code that loop unrolling generates, it may be worth it, since the compiler will produce a version of the loop for scalar operations as well as one for vector operations.

In some cases unrolling hurts performance. For instance, a loop with 20 iterations and a vector length of 16 results in one vector loop iteration that executes 16 operations at once and a remainder loop that executes the other 4 sequentially. To avoid the remainder loop generated by the compiler, we can put this before the loop:

#pragma vector novecremainder //or -mP2OPT_hpo_vec_peel = F to disable peel and remainder loops (compiler internal option)

#pragma nounroll //where unrolling is not worth it at all
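The 20-iteration case above can be worked through as plain arithmetic. The sketch below only models how a trip count divides into vector passes and leftover scalar iterations. The names `LoopSplit` and `split_loop` are hypothetical, and this is not how the compiler itself represents the split.

```cpp
#include <cstddef>

// Hypothetical model of how a vectorized loop divides its iterations.
struct LoopSplit {
    std::size_t vector_passes;        // passes through the vectorized main loop
    std::size_t vectorized_elements;  // elements covered by those passes
    std::size_t remainder_elements;   // elements left for the scalar remainder loop
};

// Assumes vector_length > 0.
LoopSplit split_loop(std::size_t trip_count, std::size_t vector_length) {
    LoopSplit split;
    split.vector_passes = trip_count / vector_length;
    split.vectorized_elements = split.vector_passes * vector_length;
    split.remainder_elements = trip_count % vector_length;
    return split;
}
```

For a trip count of 20 and a vector length of 16 this gives one vector pass covering 16 elements and 4 remainder elements, matching the description above.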

Just to clarify #pragma ivdep:

It gives specific hints that modify the compiler's heuristics about dependencies, and it must be used only when we know that the assumed dependencies are safe to ignore.
Most importantly, it overrides only potential (assumed) dependencies; the compiler still performs dependence analysis. Try #pragma simd to vectorize regardless of any analysis.
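A minimal sketch of the kind of loop where ivdep applies: the compiler cannot know the value of `offset`, so it has to assume that a write to `buf[i + offset]` might feed a later read of `buf[i]`. The function name `shift_copy` and its precondition are assumptions for this example, and the GCC spelling `#pragma GCC ivdep` is an addition that the answer does not mention.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: writes buf[i] + 1 into a shifted window of one buffer.
// The programmer's promise: offset >= count, so the written range
// [offset, offset + count) never overlaps the read range [0, count).
void shift_copy(float* buf, std::size_t count, std::size_t offset) {
    assert(offset >= count);  // the assumed dependence really is absent
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(_MSC_VER) && !defined(__clang__)
#pragma loop(ivdep)
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (std::size_t i = 0; i < count; ++i) {
        buf[i + offset] = buf[i] + 1.0f;
    }
}
```

If the promise were false (for example `offset == 1`), the hint could make the compiler generate code that produces wrong results, which is why it belongs only on loops whose dependencies are known to be safe to ignore.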

Hope this helps.

answered Oct 21 '22 00:10

Igor Freitas

Tags:

c++

preprocessor-directive

visual-c++

pragma

Steve Barna
