
Preferred idiom for endianness-agnostic reads

In the Plan 9 source code I often find code like this to read serialised data from a buffer with a well-defined endianness:

#include <stdint.h>

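uint32_t le32read(const uint8_t buf[static 4]) {
    /* Cast before shifting: buf[3] << 24 on a promoted int overflows when buf[3] >= 0x80. */
    return ((uint32_t)buf[0] | (uint32_t)buf[1] << 8 | (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24);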
}

I expected both gcc and clang to compile this code into something as simple as this assembly on amd64:

    .global le32read
    .type le32read,@function
le32read:
    mov (%rdi),%eax
    ret
    .size le32read,.-le32read

But contrary to my expectations, neither gcc nor clang recognizes this pattern; both produce complex assembly with multiple shifts instead.

Is there an idiom for this kind of operation that is both portable to all C99 implementations and produces good code (i.e. like the assembly presented above) across implementations?

asked Aug 09 '14 14:08 by fuz


2 Answers

After some research (and with the help of the terrific people in ##c on Freenode), I found that gcc 5.0 will implement optimizations for the kind of pattern described above. In fact, it compiles the C source listed in my question to exactly the assembly listed there.

I haven't found similar information about clang, so I filed a bug report. As of Clang 9.0, clang recognises both the read and the write idiom and turns them into fast code.
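
The write idiom mentioned here is the mirror image of the read in the question. As a minimal sketch, assuming a hypothetical function name le32write (it does not appear in the question), it might look like this:

```c
#include <stdint.h>

/* Hypothetical counterpart to le32read: store x into buf in
 * little-endian order, independent of the host's byte order. */
void le32write(uint8_t buf[static 4], uint32_t x) {
    buf[0] = (uint8_t)(x & 0xff);          /* least significant byte first */
    buf[1] = (uint8_t)((x >> 8) & 0xff);
    buf[2] = (uint8_t)((x >> 16) & 0xff);
    buf[3] = (uint8_t)((x >> 24) & 0xff);  /* most significant byte last */
}
```

Whether a given compiler folds the four byte stores into a single 32-bit store depends on its version and optimization level, so it is worth checking the generated assembly rather than assuming it.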

answered Nov 05 '22 21:11 by fuz


If you want to guarantee a conversion between the native platform order and a defined order (network byte order, which is big-endian, for example), you can let the system libraries do the work and simply use the functions of <netinet/in.h>: htons and htonl, and ntohs and ntohl.

But I must admit that the include file is not guaranteed to exist everywhere: under Windows I think it is winsock.h.
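
As a rough sketch of this approach, assuming a POSIX system where <netinet/in.h> provides ntohl, a 32-bit value in network order could be read from a buffer as follows. The helper name be32read_net is hypothetical. Note that these functions only handle big-endian data, so they do not directly replace the little-endian le32read from the question.

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>  /* assumption: POSIX system; not available on Windows */

/* Hypothetical helper: read a 32-bit big-endian (network order)
 * value from buf and return it in host byte order. */
uint32_t be32read_net(const uint8_t buf[static 4]) {
    uint32_t raw;
    memcpy(&raw, buf, sizeof raw);  /* copy bytes; safe for unaligned buffers */
    return ntohl(raw);              /* network order -> host order */
}
```

For a buffer holding the bytes 0x11, 0x22, 0x33, 0x44, this returns 0x11223344 on both little-endian and big-endian hosts.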

answered Nov 05 '22 19:11 by Serge Ballesta