In the Plan 9 source code I often find code like this to read serialised data from a buffer with a well-defined endianess:
#include <stdint.h>
uint32_t le32read(uint8_t buf[static 4]) {
return (buf[0] | buf[1] << 8 | buf[2] << 16 | buf[3] << 24);
}
I expected both gcc and clang to compile this code into something as simple as this assembly on amd64:
.global le32read
.type le32read,@function
le32read:
mov (%rdi),%eax
ret
.size le32read,.-le32read
But contrary to my expectations, neither gcc nor clang recognize this pattern and produce complex assembly with multiple shifts instead.
Is there an idiom for this kind of operation that is both portable to all C99-implementations and produces good (i.e. like the one presented above) code across implementations?
After some research, I found (with the help of the terrific people in ##c on Freenode), that gcc 5.0 will implement optimizations for the kind of pattern described above. In fact, it compiles the C source listed in my question to the exact assembly I listed below.
I haven't found similar information about clang, so I filed a bug report. As of Clang 9.0, clang recognises both the read as well as the write idiom and turns it into fast code.
If you want to guaranty a conversions between a native platform order and a defined order (order on a network for example) you can let system libraries to the work and simply use the functions of <netinet/in.h>
: hton, htons, htonl and ntoh, ntohs, nthol.
But I must admit that the include file is not guaranteed : under Windows I think it is winsock.h
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With