
Correct way of unrolling loop using gcc

#include <stdio.h>

int main() {
    int i;
    for (i = 0; i < 10000; i++) {
        printf("%d", i);
    }
}

I want gcc to unroll the loop in this code, but even when I compile with the unrolling flag

gcc -O2 -funroll-all-loops --save-temps unroll.c

the generated assembly still contains a single loop of 10000 iterations:

_main:
Leh_func_begin1:
        pushq   %rbp
Ltmp0:
        movq    %rsp, %rbp
Ltmp1:
        pushq   %r14
        pushq   %rbx
Ltmp2:
        xorl    %ebx, %ebx
        leaq    L_.str(%rip), %r14
        .align  4, 0x90
LBB1_1:
        xorb    %al, %al
        movq    %r14, %rdi
        movl    %ebx, %esi
        callq   _printf
        incl    %ebx
        cmpl    $10000, %ebx
        jne     LBB1_1
        popq    %rbx
        popq    %r14
        popq    %rbp
        ret
Leh_func_end1:

Can someone please tell me how to get gcc to unroll this loop correctly?

Neel Choudhury asked Sep 30 '13 21:09



2 Answers

Loop unrolling won't give you any benefit for this code, because the cost of the printf() call dominates the work done in each iteration. The compiler may be aware of this: since it is being asked to optimize, it may decide that unrolling would only increase code size, and with it the risk of instruction cache misses, for no appreciable run-time gain.

Speeding up this loop by unrolling would require reducing the number of calls to printf() itself. I am unaware of any optimizing compiler that is capable of doing that.

As an example of unrolling the loop to reduce the number of printf() calls, consider this code:

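#include <stdio.h>

/* Prints 0 .. n-1 using one printf() call per eight numbers; assumes n >= 0. */
void print_loop_unrolled (int n) {
    int i = -8;
    if (n % 8) {
        /* The leftover n % 8 numbers are the single digits 0 .. (n % 8) - 1. */
        printf("%.*s", n % 8, "01234567");
        i += n % 8;
    }
    /* i steps through n % 8, n % 8 + 8, ... and stops once it reaches n. */
    while ((i += 8) < n) {
        printf("%d%d%d%d%d%d%d%d",i,i+1,i+2,i+3,i+4,i+5,i+6,i+7);
    }
}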
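
A quick way to convince yourself that the hand-unrolled version emits exactly the same characters as the original loop is to format into a buffer instead of stdout and compare. The sketch below does that with snprintf(). The helper names format_rolled and format_unrolled, the buffer-based interface, and the assumptions that n >= 0 and that the buffer is large enough are all illustrative choices, not part of the answer's code.

```c
#include <stddef.h>
#include <stdio.h>

/* Reference version: one snprintf() call per number, like the loop in the
   question. Assumes cap >= 1 and that buf can hold the whole output. */
size_t format_rolled(char *buf, size_t cap, int n) {
    size_t len = 0;
    buf[0] = '\0';
    for (int i = 0; i < n && len < cap; i++) {
        int w = snprintf(buf + len, cap - len, "%d", i);
        if (w < 0) break;          /* encoding error: stop early */
        len += (size_t)w;
    }
    return len;
}

/* Hand-unrolled version: same structure as print_loop_unrolled(), but
   writing into buf. Assumes n >= 0, cap >= 1, and a large enough buf. */
size_t format_unrolled(char *buf, size_t cap, int n) {
    size_t len = 0;
    int i = -8;
    buf[0] = '\0';
    if (n % 8) {
        /* The first n % 8 numbers are the digits 0 .. (n % 8) - 1. */
        int w = snprintf(buf, cap, "%.*s", n % 8, "01234567");
        if (w > 0) len = (size_t)w;
        i += n % 8;
    }
    while ((i += 8) < n && len < cap) {
        int w = snprintf(buf + len, cap - len, "%d%d%d%d%d%d%d%d",
                         i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7);
        if (w < 0) break;          /* encoding error: stop early */
        len += (size_t)w;
    }
    return len;
}
```

Calling both helpers with the same n and comparing the two buffers with strcmp() should report identical contents for any non-negative n that fits in the buffer.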
jxh answered Sep 20 '22 07:09


gcc caps how far it will unroll a loop through several --param limits.

Use -O3 -funroll-loops and experiment with the parameters max-unroll-times, max-unrolled-insns and max-average-unrolled-insns.

Example:

-O3 -funroll-loops --param max-unroll-times=200
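
As a complement to the command-line flags, GCC 8 and later also accept a per-loop `#pragma GCC unroll n` placed directly before the loop. The sketch below is not from the answer above: sum_array and sum_array_manual are hypothetical names, and an array sum is used only because, unlike the printf() loop in the question, its body is cheap enough that unrolling is plausible. The second function shows by hand roughly what a 4x unroll with a remainder loop looks like.

```c
#include <stddef.h>

/* Sums an array. The pragma asks GCC (8 or newer) to unroll this loop
   4 times; compilers that do not know the pragma typically ignore it. */
long sum_array(const int *values, size_t count) {
    long total = 0;
#pragma GCC unroll 4
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* The same sum unrolled by hand: four elements per iteration, followed
   by a remainder loop for the last count % 4 elements. */
long sum_array_manual(const int *values, size_t count) {
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        total += values[i];
        total += values[i + 1];
        total += values[i + 2];
        total += values[i + 3];
    }
    for (; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

One way to see what the compiler actually did is to compile with optimization and -S, then compare the loop body in the generated assembly for the two functions.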
ouah answered Sep 21 '22 07:09