Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding MIPS assembly code from C compiler

Tags:

I converted a C code to MIPS and I could not understand a part of the MIPS instructions:

#include <inttypes.h>
#include <stdint.h>

uint16_t
chksum(uint16_t sum, const uint8_t *data, uint16_t len)
{
    uint16_t t;
    const uint8_t *dataptr;
    const uint8_t *last_byte;

    dataptr = data;
    last_byte = data + len - 1;

    while (dataptr < last_byte)
    {
        t = (dataptr[0] << 8) + dataptr[1];
        sum += t;
        if (sum < t)
        {
            sum++;
        }
        dataptr += 2;
    }
    if (dataptr == last_byte)
    {
        t = (dataptr[0] << 8) + 0;
        sum += t;
        if (sum < t)
        {
            sum++;
        }
    }
    return sum;
}

I used MIPS gcc5.4 on the Godbolt compiler explorer with -O2 optimization, with the default -march of classic MIPS1 that doesn't have load interlocks:

chksum(unsigned short, unsigned char const*, unsigned short):
  andi $6,$6,0xffff
  addiu $6,$6,-1
  addu $6,$5,$6
  sltu $3,$5,$6
  beq $3,$0,$L2
  andi $2,$4,0xffff

  move $4,$5
$L4:
  lbu $3,0($4)
  lbu $7,1($4)
  sll $3,$3,8
  addu $3,$3,$7
  andi $3,$3,0xffff
  addu $2,$3,$2
  andi $2,$2,0xffff
  addiu $4,$4,2
  sltu $3,$2,$3
  sltu $7,$4,$6
  beq $3,$0,$L3
  addiu $8,$2,1

  andi $2,$8,0xffff
$L3:
  bne $7,$0,$L4
  nor $3,$0,$5

  addu $3,$3,$6
  srl $3,$3,1
  addiu $3,$3,1
  sll $3,$3,1
  addu $5,$5,$3
$L2:
  beq $6,$5,$L8
  nop

$L9:
  j $31
  nop

$L8:
  lbu $3,0($6)
  nop
  sll $3,$3,8
  addu $2,$3,$2
  andi $2,$2,0xffff
  sltu $3,$2,$3
  beq $3,$0,$L9
  nop

  addiu $2,$2,1
  j $31
  andi $2,$2,0xffff

I matched most of the instructions with the code but I could not understand the part in $L3 starting with nor instruction until the addu before $L2.

The Compiler Explorer shows that part is related to the while but I do not get why it manipulates the $5 before the branch in $L2.

like image 931
I know nothing Avatar asked May 12 '18 18:05

I know nothing


1 Answers

Let's analyze what the code is doing. A few mappings to make the code easy to follow:

Initial parameters:
    $4: sum  parameter
    $5: data parameter
    $6: len  parameter

Labels:
    $L4: while body
    $L3: while condition
    $L2: if condition

Registers:
    $2: sum
    $4: dataptr
    $6: last_byte

The relevant code:

    // [...]
    sltu $3,$5,$6     // $3 = $5 (data parameter) < $6 (last_byte) ? 1 : 0
    beq $3,$0,$L2     // if $3 == 0 goto $L2 (if condition)
    andi $2,$4,0xffff // $2 (sum) = $4 (sum parameter) & 0xFFFF
    move $4,$5        // $4 (dataptr) = $5 (data parameter)

$L4: // while body
    // [...]
    sltu $7,$4,$6     // $7 = $4 (dataptr) < $6 (last_byte) ? 1 : 0
    // [...]

$L3: // while condition
    bne $7,$0,$L4     // if $7 != 0 goto $L4 (while body) [1]

    nor $3,$0,$5      // $3 = $5 (data) nor 0

    addu $3,$3,$6     // $3 += $6 (last_byte)

    srl $3,$3,1       // $3 >>= 1
    addiu $3,$3,1     // $3++
    sll $3,$3,1       // $3 <<= 1

    addu $5,$5,$3     // $5 += $3

$L2: // if condition
  beq $6,$5,$L8       // if $6 (last_byte) == $5 goto $L8 [2]

The while loop finishes at [1]. The rest of the instructions until [2] compute a value into register $5 for comparison with $6 (last_byte), which is the if in the source code.

The question here is: what is the value in $5? If you put together all the operations, you get:

$5 = $5 + ((((($5 nor 0) + $6) >> 1) + 1) << 1)

Let's unravel this expression. First, realize that:

x NOR 0 = NOT(x OR 0) = ~(x | 0) = ~x

So it is just negating (one's complement) on $5.

Then, it adds $6, which is last_byte.

The next 3 operations (>> 1, + 1, << 1) are a way to compute the next even integer. See what happens with a few cases:

0000 (0) -> 0010 (2)
0001 (1) -> 0010 (2)
0010 (2) -> 0100 (4)
0011 (3) -> 0100 (4)
0100 (4) -> 0110 (6)
0101 (5) -> 0110 (6)
0110 (6) -> 1000 (8)
0111 (7) -> 1000 (8)

Finally, it is adding the original value of $5, which was the data parameter.

If you put all together, and replace with the names of the C variables for clarity, you arrive at:

$5 = data + next_even(~data + last_byte)

Recall that, for two's complement integers:

x - y == x + ~y + 1

Therefore:

$5 = data + next_even(last_byte - data - 1)
   = data + next_even(len - 2)

Now, computing the next even value after subtracting 2 is basically removing the lowest bit of information; in other words, a "floor" to the even numbers. This can be expressed as returning the same number if it is even, or one less if it is odd, i.e.:

$5 = data + (len % 2 ? len : len - 1)

Finally, the compiler compares this register to $6 (last_byte). Simplifying:

     last_byte == data + (len % 2 ? len : len - 1)
data + len - 1 == data + (len % 2 ? len : len - 1)
       len - 1 == len % 2 ? len : len - 1
       len % 2 != 0

Now we can also see that the expression actually only depends on len, and not on data.

The compiler, with all those instructions, is effectively re-computing dataptr from data and last_bytes. Indeed, if you consider that dataptr is only ever advanced from data in increments of 2, we can rewrite it as:

data + 2 * n_increments
data + 2 * (len / 2)
data + (len % 2 ? len : len - 1)

Which is precisely the value of $5 computed above.

Knowing this, one can wonder why the compiler arrived at this solution. The same happens for a recent version of GCC (8.1.0) and x86-64:

mov rdx, rsi
not rdx
add rdx, r8
shr rdx
lea rsi, [rsi+2+rdx*2]

It is clear that the optimizer is realizing that the final value of dataptr can be computed independently of the while loop -- however, it is unclear why it decides to do so instead of picking the value from the register. Maybe it has decided that avoiding the dependency on the result of the loop is faster (due to instruction pipelining) than otherwise.

like image 58
Acorn Avatar answered Sep 28 '22 17:09

Acorn