Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does .NET Native compile loop in reverse order?

I'm working on optimization techniques performed by the .NET Native compiler. I've created a sample loop:

        for (int i = 0; i < 100; i++)
        {
            Function();
        }

And I've compiled it with Native. Then I disassembled the result .dll file with machine code inside in IDA. As the result, I have:

IDA output

(I've removed a few unnecessary lines, so don't worry that address lines are inconsistent)

I understand that add esi, 0FFFFFFFFh means really subtract one from esi and alter Zero Flag if needed, so we can jump to the beginning if zero hasn't been reached yet.

What I don't understand is why did the compiler reverse the loop?

I came to the conclusion that

LOOP:
add esi, 0FFFFFFFFh
jnz LOOP

is just faster than for example

LOOP:
inc esi
cmp esi, 064h
jl LOOP

But is it really because of that and is the speed difference really significant?

like image 669
Kamil T Avatar asked Apr 05 '16 15:04

Kamil T


People also ask

Why reverse for loop is faster?

Backwards for loop is faster because the upper-bound (hehe, lower-bound) loop control variable does not have to be defined or fetched from an object; it is a constant zero. There is no real difference. Native loop constructs are always going to be very fast.

Can you do a for loop backwards?

Method #1 : Using reversed() The simplest way to perform this is to use the reversed function for the for loop and the iteration will start occurring from the rear side than the conventional counting.


Video Answer


2 Answers

inc might be slower than add because of the partial flag update. Moreover add affects the zero flag so you don't need to use another cmp instruction. Just jump directly.

This is one famous type of loop optimization

reversal: Loop reversal reverses the order in which values are assigned to the index variable. This is a subtle optimization which can help eliminate dependencies and thus enable other optimizations. Also, certain architectures utilize looping constructs at Assembly language level that count in a single direction only (e.g. decrement-jump-if-not-zero (DJNZ)).

  • Is it faster to count down than it is to count up?
  • GCC Loop optimization

You can see the result for other compilers here.

like image 168
phuclv Avatar answered Oct 17 '22 11:10

phuclv


Your conclusion is correct: inverted cycle will target 0 (cycle will ends when register value reach 0), so that Add will set zero flag used in conditional branch.

This way you don't need dedicated Cmp which leads to: 1) size optimization 2) it's also faster (conclusion from compiler programmers decision and another answer).

That's pretty common assembler trick to write loop targeting 0. I am surprised you understand assembler, but don't know (asking) about it.

like image 37
Sinatr Avatar answered Oct 17 '22 11:10

Sinatr