
Understanding behavior of old C++ code

I am migrating parts of some old C++ code that was originally compiled with CodeGear C++Builder® 2009, version 12.0.3170.16989.

The following code, a minimal version of a bigger piece, outputs -34 with any modern compiler. On the original platform, however, it outputs 84:

char Key[4];    
Key[0] = 0x1F;
Key[1] = 0x01;
Key[2] = 0x8B;
Key[3] = 0x55;

for(int i = 0; i < 2; i++) {
    Key[i] = Key[2*i] ^ Key[2*i + 1];
}

std::cout << (int) Key[1] << std::endl;

The following code outputs -34 with both old and new compilers:

for(int i = 0; i < 2; i++) {
    char a = Key[2*i];
    char b = Key[2*i + 1];
    char c = a ^ b;
    Key[i] = c;
}

Also, manually unrolling the loop seems to work with both compilers:

Key[0] = Key[0] ^ Key[1];
Key[1] = Key[2] ^ Key[3];

It is important that I match the behavior of the old code. Can anyone please help me understand why the original compiler produces those results?

Iban Cereijo asked Aug 13 '17 15:08


1 Answer

This seems to be a bug:

The line

Key[i] = Key[2*i] ^ Key[2*i + 1];

generates the following code:

00401184 8B55F8           mov edx,[ebp-$08]
00401187 8A4C55FD         mov cl,[ebp+edx*2-$03]
0040118B 8B5DF8           mov ebx,[ebp-$08]
0040118E 304C1DFC         xor [ebp+ebx-$04],cl

That does not make sense. It loads Key[2*i + 1] into cl and then XORs it directly into Key[i], so Key[2*i] is never read. This is something like:

Key[i] ^= Key[i*2 + 1];

And that explains how the result came to be: for i = 1 this computes Key[1] ^ Key[3], and 0x01 ^ 0x55 is indeed 0x54, or 84.
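Since the question is about matching the old behavior, here is a minimal sketch that puts the two variants side by side. The function names are hypothetical, and signed char is my assumption so that the numbers do not depend on how a given compiler treats plain char:

```cpp
// Hypothetical helpers, not part of the original code base.
// signed char is used explicitly so the results do not depend on
// whether plain char is signed on the target compiler.

// What the source line asks for: Key[i] = Key[2*i] ^ Key[2*i + 1].
inline void fold_as_written(signed char* key, int pairs) {
    for (int i = 0; i < pairs; ++i) {
        key[i] = static_cast<signed char>(key[2 * i] ^ key[2 * i + 1]);
    }
}

// What the disassembly above actually does: Key[i] ^= Key[2*i + 1].
// Key[2*i] is never read, which only matters once i > 0.
inline void fold_as_compiled(signed char* key, int pairs) {
    for (int i = 0; i < pairs; ++i) {
        key[i] = static_cast<signed char>(key[i] ^ key[2 * i + 1]);
    }
}
```

With the key from the question (0x1F, 0x01, 0x8B, 0x55) and pairs = 2, fold_as_written leaves -34 in Key[1], while fold_as_compiled leaves 84. Key[0] is 0x1E in both cases, because for i = 0 the indices i and 2*i coincide.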

It should be something like:

mov edx,[ebp-$08]
mov cl,[ebp+edx*2-$04]
xor cl,[ebp+edx*2-$03]
mov [ebp+edx-$04],cl

So this is definitely a code generation bug. It seems to persist to this day, up to and including C++Builder 10.2 Tokyo, in the "classic" (Borland) compiler.

But if I use the "new" (clang) compiler, it produces 222. The code produced is:

File7.cpp.12: Key[i] = Key[2*i] ^ Key[2*i + 1];
004013F5 8B45EC           mov eax,[ebp-$14]
004013F8 C1E001           shl eax,$01
004013FB 0FB64405F0       movzx eax,[ebp+eax-$10]
00401400 8B4DEC           mov ecx,[ebp-$14]
00401403 C1E101           shl ecx,$01
00401406 0FB64C0DF1       movzx ecx,[ebp+ecx-$0f]
0040140B 31C8             xor eax,ecx
0040140D 88C2             mov dl,al
0040140F 8B45EC           mov eax,[ebp-$14]
00401412 885405F0         mov [ebp+eax-$10],dl

That doesn't look optimal to me (I used O2 and O3 with the same result), but it produces the right result.
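Note that 222 and the -34 from the question are the same byte, 0xDE (0x8B ^ 0x55), widened to int in two different ways. A small sketch with hypothetical helper names shows both readings. Which one you get from a plain char is implementation-defined, so I am only assuming that this accounts for the difference between the two outputs:

```cpp
// Hypothetical helpers: both take the raw byte produced by
// Key[2] ^ Key[3] and widen it to int, once as a signed char and once
// as an unsigned char.

// Reading the byte as signed. On two's complement targets 0xDE becomes -34
// (guaranteed since C++20, implementation-defined before that).
inline int widen_as_signed(unsigned char byte) {
    return static_cast<int>(static_cast<signed char>(byte));
}

// Reading the byte as unsigned. 0xDE stays 222.
inline int widen_as_unsigned(unsigned char byte) {
    return static_cast<int>(byte);
}
```

If the migrated code has to print the same number on every compiler, declaring Key as signed char or unsigned char explicitly removes the dependency on the default.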

Rudy Velthuis answered Oct 01 '22 12:10