I have implemented (read: copy&pasted from wiki) the XXTEA cipher in a C++ project. For clarity reasons I separated the encryption and decryption in separate functions: (NOTE: this is not a cryptography question! Please do not comment on the chosen cipher)
#define DELTA 0x9e3779b9
#define MX (((z>>5^y<<2) + (y>>3^z<<4)) ^ ((sum^y) + (key[(p&3)^e] ^ z)))
static void btea_enc( unsigned int *v, unsigned n, const unsigned int* key ) {
unsigned int y, z, sum;
unsigned p, rounds, e;
rounds = 16 + 52/n;
sum = 0;
z = v[n-1];
do {
sum += DELTA;
e = (sum >> 2) & 3;
for (p=0; p<n-1; p++) {
y = v[p+1];
z = v[p] += MX;
}
y = v[0];
z = v[n-1] += MX;
} while (--rounds);
}
static void btea_dec( unsigned int *v, unsigned n, const unsigned int* key ) {
unsigned int y, z, sum;
unsigned p, rounds, e;
rounds = 16 + 52/n;
sum = rounds*DELTA;
y = v[0];
do {
e = (sum >> 2) & 3;
for (p=n-1; p>0; p--) {
z = v[p-1];
y = v[p] -= MX;
}
z = v[n-1];
y = v[0] -= MX;
} while ((sum -= DELTA) != 0);
}
#undef MX
#undef DELTA
When this code is compiled in Debug, it works perfectly. However when I compile this code with (default) optimizations with Visual Studio 2013 (v120), btea_dec loses its outer loop (causing the decryption to produce garbage).
The disassembly listing for encryption and decryption. Notice the missing outer loop during decryption! (if you'd like the code as text, I'd be happy to upload, it's just a wall of text)
Looking at the actual code, the end condition is an overflowing unsigned int 'sum':while ((sum -= DELTA) != 0)
I don't understand what the compiler did to make it think it could get rid of this loop (afaik overflow is only undefined for integers, unsigned overflow is perfectly fine).
Question: Why is the compiler 'optimizing' the outer loop away? And how do I fix it?
MCVE: (paste the previous code block containing btea_enc and btea_dec between include & main)
#define _CRT_RAND_S
#include <cstdlib>
int main(int argc, char* argv[])
{
// Random key
unsigned int key[4];
rand_s(&key[0]);
rand_s(&key[1]);
rand_s(&key[2]);
rand_s(&key[3]);
// Buffer we'll be encrypting
unsigned int utext[4];
memcpy(utext, "SecretPlaintext", 16);
// Encrypt
btea_enc(utext, 4, key);
// Decrypt
btea_dec(utext, 4, key);
// Should still be equal!
bool s = !strcmp((char*)utext, "SecretPlaintext");
// Print message
printf("Compared: %s\n", s ? "equal" : "falsly");
return s?0:1;
}
WIth /GL, the compiler knows that n == 4
and therefore rounds == 29
. It's surely precomputing the initial value of sum
, which is rounds*DELTA
, as well.
Next, it may try to calculate the number of loop iterations and unroll the outer loop. If it's doing that wrongly (as I did in my other answer), it may be doing uint32_t(rounds * DELTA) / DELTA
, which is one. Add the first iteration for being a do-while, and that's where the outer loop went.
gnasher's loop control code is much easier for a compiler to figure out, there are exactly rounds
(29) iterations, which it might or might not decide to unroll, but there's very little room to mess up the number of iterations.
Why this happens is beyond me. You could try replacing
} while ((sum -= DELTA) != 0);
with
sum -= DELTA;
} while ((--rounds) != 0);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With