Consider a pared-down example of down-casting unsigned to unsigned char:
void unsigned_to_unsigned_char(unsigned *sp, unsigned char *dp)
{
    *dp = (unsigned char)*sp;
}
The above C code is translated to assembly with gcc -Og -S as
movl (%rdi), %eax
movb %al, (%rsi)
For what reason is the C-to-assembly translation not as below?
movb (%rdi), %al
movb %al, (%rsi)
Is it because this is incorrect, or because movl is more conventional, or shorter in encoding, than movb?
Writing to an 8-bit x86 register can incur an extra merge µop when the new low byte is merged with the old high bytes of the corresponding 32/64-bit register. It can also create an unexpected data dependency on the previous value of the register.
For this reason, it is generally a good idea to only write to 32/64 bit variants of general purpose registers on x86.
The cast in your question is wholly unnecessary as the language will effectively perform that cast before the assignment anyway, and so it contributes nothing to the generated code (remove it and see no changes, no errors or warnings).
The right-hand side dereference is of type unsigned int, so a 32-bit load is what gets done. Given a 32-bit bus, there's no performance penalty for doing a word dereference (modulo alignment issues).
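To check the claim that the explicit cast is redundant, here is a minimal sketch. The names with_cast, without_cast and same_result are made up for illustration and are not from the question. It stores through both forms and compares the results against the value reduced modulo UCHAR_MAX + 1, which is how C defines conversion to unsigned char.

```c
#include <limits.h>

/* Variant from the question: explicit cast on the loaded value. */
void with_cast(unsigned *sp, unsigned char *dp)
{
    *dp = (unsigned char)*sp;
}

/* Same store without the cast: the assignment converts implicitly. */
void without_cast(unsigned *sp, unsigned char *dp)
{
    *dp = *sp;
}

/* Hypothetical helper: returns 1 if both variants store the same byte
   and that byte equals v reduced modulo (UCHAR_MAX + 1). */
int same_result(unsigned v)
{
    unsigned char a = 0, b = 0;
    with_cast(&v, &a);
    without_cast(&v, &b);
    return a == b && a == (unsigned char)(v % (UCHAR_MAX + 1u));
}
```

Calling same_result with any unsigned value should return 1. Whether the two functions also compile to identical instructions depends on the compiler and flags, so it is worth comparing the output yourself.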
If you want a byte load instead, you can cast the pointer before the dereference, as follows:
void unsigned_to_unsigned_char(unsigned *sp, unsigned char *dp)
{
    *dp = *(unsigned char *)sp;
}
This will produce the byte move instructions you're expecting.
https://godbolt.org/z/57nzrsrMe
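One caveat about casting the pointer: it reads the first byte of the object in memory, which matches the low-order byte only on a little-endian target such as x86. The sketch below illustrates the difference. The function names are hypothetical, and the endianness probe is an assumption of this example, not something from the question.

```c
#include <string.h>

/* Converts the loaded value: always yields the low-order byte. */
unsigned char low_byte_by_value(const unsigned *sp)
{
    return (unsigned char)*sp;
}

/* Reads through a byte pointer: yields whichever byte is stored first. */
unsigned char first_byte_in_memory(const unsigned *sp)
{
    return *(const unsigned char *)sp;
}

/* Hypothetical probe: returns 1 when the lowest-addressed byte of an
   unsigned holding 1 is itself 1, i.e. on a little-endian target. */
int is_little_endian(void)
{
    unsigned one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}
```

On x86 both functions return the same byte, so the two versions of unsigned_to_unsigned_char behave identically there. On a big-endian machine they would generally differ, so the pointer-cast form is not a portable replacement.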