Relation of endianness to assembly conversion of size in C

Question

Please note that the below is adapted from Problem 3.4 of Bryant and O'Hallaron's text (CSAPP3e). I have stripped away everything but my essential question.

Context: we are looking at an x86-64/Linux/gcc combination in which ints are 4 bytes and chars are signed (and, of course, 1 byte). We want to write the assembly for converting an int to a char, which, at a high level, we know amounts to truncation.

They present the following solution:

movl (%rdi), %eax            // Read 4 bytes
movb %al, (%rsi)             // Store low-order byte

My question is whether we can change the movl to a movb since, after all, we are only using a byte in the end. My concern with this suspicion is that there might be some endian-dependence with the read, and we might somehow be getting the high bits if our processor/OS is in little-endian mode. Is this suspicion correct, or would my change work no matter what?

I would try this out but 1) I am on a Mac with Apple silicon and 2) even if my suspicion worked, I couldn't be sure if this sort of thing was implementation-dependent.

zwol · Accepted Answer

You're right to be concerned about endianness for this kind of operation, but in this case, your alternative approach would fail on big-endian machines, not on little-endian ones.

x86 is little-endian, which means the low-order eight bits of a 32-bit integer are stored in the first (lowest-address) byte of that integer, so

movb (%rdi), %al     // Read low-order byte
movb %al, (%rsi)     // Store low-order byte

will do the truncation you want to do on x86. But on a big-endian machine the equivalent operation would read the highest eight bits of the 32-bit integer. The m68k architecture, for instance, is big-endian; a correct version of your alternative approach, for that architecture, would be

move.b 3(%a1), %d0   // Read low-order byte
move.b %d0, (%a0)    // Store low-order byte

Without the 3 it would read the high-order byte of the int pointed to by register %a1.
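The layout difference described above can be checked from C. This is only a minimal sketch: the helper names are hypothetical, and it assumes a 32-bit integer type (int32_t) is available. Copying the object's bytes out with memcpy shows which byte sits at the lowest address, while masking the value gives the low-order byte regardless of byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the byte stored at the lowest address of x,
   i.e. what a one-byte load from the int's address would fetch. */
static unsigned char first_byte_in_memory(int32_t x) {
    unsigned char bytes[sizeof x];
    memcpy(bytes, &x, sizeof x);   /* copy the object representation */
    return bytes[0];
}

/* Hypothetical helper: the low-order byte computed from the value,
   which does not depend on how the bytes are laid out in memory. */
static unsigned char low_order_byte(int32_t x) {
    return (unsigned char)((uint32_t)x & 0xFFu);
}

/* Hypothetical helper: nonzero if the low-order byte is stored first. */
static int is_little_endian(void) {
    return first_byte_in_memory(1) == 1;
}
```

On a little-endian machine such as x86-64 the two helpers should agree for every input; on a big-endian machine first_byte_in_memory would return the high-order byte instead.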

The virtue of doing it the way CS:APP does it is that the same construct will work correctly on both big- and little-endian architectures. Of course, if you're programming in assembly language you have to rewrite the code anyway to move the program to a different architecture, but it's one fewer thing to worry about while you're doing that.
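At the C level, the same portability shows up because the conversion is defined on values, not on bytes in memory. The function below is a sketch of the kind of source the CS:APP sequence corresponds to; the name convert_int_to_char is made up for illustration. Note that converting an out-of-range int to a signed char is implementation-defined in C; gcc documents it as reduction modulo 2^N, which is the truncation discussed here.

```c
/* Load the whole int, then store its low-order byte.
   The cast operates on the value, so no byte offset appears in the source
   and the same code is correct on big- and little-endian targets. */
void convert_int_to_char(const int *sp, char *dp) {
    *dp = (char)*sp;
}
```

For example, with an int holding 0x11223341, the stored char would be 0x41 on either kind of machine.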

Compiler-generated code will probably also do it the CS:APP way for related reasons: compilers usually do most of their work in an architecture-independent "intermediate representation" and then translate that to assembly language. That translation is one of the most complex phases of an industrial-grade compiler, for reasons beyond the scope of this answer; every simplifying assumption that doesn't make the generated code worse will be applied to make the translator easier to write.

Erik Eidt · Answer

There’s almost no difference between using movl and movb here.

If the load address is unaligned and the 4-byte read straddles a page boundary, then movl could be slower than movb.

On the other hand, if the source is potentially a common subexpression, then movl provides access to both the full value and the truncated value, whereas movb only provides access to the truncated value.
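A hypothetical case where the full load pays off might look like the sketch below. The function name is invented for illustration, and whether a compiler actually reuses the loaded register depends on the compiler and its optimization settings.

```c
/* Needs both the truncated and the full value of *sp.
   A single 4-byte load into a register can serve both uses:
   the low byte is stored through dp, the full value is returned. */
int store_low_and_return_full(const int *sp, char *dp) {
    int full = *sp;      /* one load of the whole int */
    *dp = (char)full;    /* reuse: low-order byte only */
    return full;         /* reuse: full 32-bit value */
}
```

Had only one byte been loaded, the full value would need a second memory access.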

It is hard to imagine how endianness comes into play here on x86 platforms. If for some reason you move to a big-endian platform, the byte-sized load needs different code: the equivalent of movb 3(%rdi), %al, which reads the low-order byte from offset 3 of the source. The movl approach carries over unmodified.

Tags:

c

assembly

x86-64

endianness

EE18
