Is reading one byte at a time endianness agnostic regardless of value size?

Tags: c, endianness

Say I am reading and writing uint32_t values to and from a stream. If I read/write one byte at a time to/from a stream and shift each byte like the below examples, will the results be consistent regardless of machine endianness?

In the examples here the stream is a buffer in memory called p.

static uint32_t s_read_uint32(uint8_t** p)
{
    uint32_t value;
    value  = (*p)[0];
    value |= (((uint32_t)((*p)[1])) << 8);
    value |= (((uint32_t)((*p)[2])) << 16);
    value |= (((uint32_t)((*p)[3])) << 24);
    *p += 4;
    return value;
}

static void s_write_uint32(uint8_t** p, uint32_t value)
{
    (*p)[0] = value & 0xFF;
    (*p)[1] = (value >> 8 ) & 0xFF;
    (*p)[2] = (value >> 16) & 0xFF;
    (*p)[3] = value >> 24;
    *p += 4;
}

I don't currently have access to a big-endian machine to test this out, but the idea is if each byte is written one at a time each individual byte can be independently written or read from the stream. Then the CPU can handle endianness by hiding these details behind the shifting operations. Is this true, and if not could anyone please explain why not?

asked May 30 '19 20:05

Cecil

2 Answers

If I read/write one byte at a time to/from a stream and shift each byte like the below examples, will the results be consistent regardless of machine endianness?

Yes. Your s_write_uint32() function stores the bytes of the input value in order from least significant to most significant, regardless of their order in the native representation of that value. Your s_read_uint32() correctly reverses this process, regardless of the underlying representation of uint32_t. These work because

- the behavior of the shift operators (<<, >>) is defined in terms of the value of the left operand, not its representation,
- the & 0xff masks off all bits of the left operand but those of its least-significant byte, regardless of the value's representation (because 0xff has a matching representation), and
- the |= operations just put the bytes into the result; the positions are selected, appropriately, by the preceding left shift. This might be clearer if += were used instead, but the result would be no different.
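As a rough illustration of the points above, the following sketch repeats the question's two functions and adds a small check. The helper name check_uint32_stream is hypothetical, not from the question. The check asserts nothing about the host: it only inspects the bytes in the buffer, which the shifts and masks determine by value.

```c
#include <stdint.h>

/* Same logic as the question's functions, repeated so this compiles alone. */
static uint32_t s_read_uint32(uint8_t **p)
{
    uint32_t value;
    value  = (*p)[0];
    value |= (uint32_t)(*p)[1] << 8;
    value |= (uint32_t)(*p)[2] << 16;
    value |= (uint32_t)(*p)[3] << 24;
    *p += 4;
    return value;
}

static void s_write_uint32(uint8_t **p, uint32_t value)
{
    (*p)[0] = value & 0xFF;
    (*p)[1] = (value >> 8) & 0xFF;
    (*p)[2] = (value >> 16) & 0xFF;
    (*p)[3] = value >> 24;
    *p += 4;
}

/* Hypothetical helper: returns 1 if the buffer holds the value
 * least-significant byte first and reading it back restores the value. */
static int check_uint32_stream(void)
{
    uint8_t buf[4] = {0};
    uint8_t *w = buf;
    uint8_t *r = buf;

    s_write_uint32(&w, 0x11223344u);

    /* Expected layout on any host: 44 33 22 11. */
    int layout_ok = buf[0] == 0x44 && buf[1] == 0x33 &&
                    buf[2] == 0x22 && buf[3] == 0x11;

    uint32_t back = s_read_uint32(&r);

    /* Both cursors should have advanced by exactly four bytes. */
    return layout_ok && back == 0x11223344u && w == buf + 4 && r == buf + 4;
}
```

Passing on a little-endian machine does not by itself prove anything about a big-endian one; the guarantee comes from the shift operators being defined on values. The check is mainly useful as a regression test for the byte layout.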

Note, however, that to some extent, you are reinventing the wheel. POSIX defines a function pair htonl() and ntohl() -- supported also on many non-POSIX systems -- for dealing with byte-order issues in four-byte numbers. The idea is that when sending, everyone uses htonl() to convert from host byte order (whatever that is) to network byte order (big endian) and sends the resulting four-byte buffer. On receipt, everyone accepts four bytes into one number, then uses ntohl() to convert from network to host byte order. Be aware that network byte order is the opposite of the least-significant-byte-first order your functions produce, so the two formats are not interchangeable on the wire.
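A minimal sketch of the htonl()/ntohl() approach, assuming a POSIX system that provides <arpa/inet.h>. The function names net_write_uint32 and net_read_uint32 are made up for this illustration. Unlike the question's functions, the bytes land in the buffer most significant first.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store value in network (big-endian) byte order. */
static void net_write_uint32(uint8_t **p, uint32_t value)
{
    uint32_t be = htonl(value);   /* host order -> network order */
    memcpy(*p, &be, sizeof be);   /* copy the four bytes as they sit in memory */
    *p += sizeof be;
}

/* Hypothetical helper: load a value stored in network byte order. */
static uint32_t net_read_uint32(uint8_t **p)
{
    uint32_t be;
    memcpy(&be, *p, sizeof be);   /* memcpy avoids unaligned access */
    *p += sizeof be;
    return ntohl(be);             /* network order -> host order */
}
```

With these helpers, writing 0x11223344 yields the bytes 11 22 33 44 in the buffer, whatever the host's native order.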

answered Nov 11 '22 07:11

John Bollinger

It'll work, but a memcpy followed by a conditional byteswap will give you much better codegen for the write function.

#include <stdint.h>
#include <string.h>

/* Nonzero on a little-endian host: the lowest-addressed byte of the value 1 is then 1. */
#define LE (((char*)&(uint_least32_t){1})[0])
void byteswap(void*,size_t); /* reverses n bytes in place; defined elsewhere */

uint32_t s2_read_uint32(uint8_t** p)
{
    uint32_t value;
    memcpy(&value,*p,sizeof(value));
    if(!LE) byteswap(&value,sizeof(value)); /* stream is little-endian */
    *p+=4;
    return value;
}

void s2_write_uint32(uint8_t** p, uint32_t value)
{
    memcpy(*p,&value,sizeof(value));
    if(!LE) byteswap(*p,sizeof(value));
    *p+=4;
}
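The answer only declares byteswap and leaves its definition out. One possible definition, offered as a sketch rather than as the author's version, reverses the bytes in place. It takes a void* so that both &value and *p can be passed without a cast; the declaration has to match this signature.

```c
#include <stddef.h>

/* Reverse the n bytes starting at ptr, in place.
 * Assumed semantics: the answer does not define this function. */
void byteswap(void *ptr, size_t n)
{
    unsigned char *bytes = ptr;

    /* Swap the i-th byte from the front with the i-th from the back. */
    for (size_t i = 0; i < n / 2; ++i) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[n - 1 - i];
        bytes[n - 1 - i] = tmp;
    }
}
```

For a fixed four-byte swap, compilers such as GCC and clang also offer a __builtin_bswap32 intrinsic, which may be worth considering if portability to other compilers is not a concern.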

GCC since the 8 series (but not clang, at the time of writing) can eliminate the shifts in your original byte-at-a-time version on little-endian platforms, but you should help it by restrict-qualifying the doubly-indirect pointer to the destination. Otherwise it may assume that a write to (*p)[0] can invalidate *p (uint8_t is in practice a character type and is therefore permitted to alias anything).

void s_write_uint32(uint8_t** restrict p, uint32_t value)
{
    (*p)[0] = value & 0xFF;
    (*p)[1] = (value >> 8 ) & 0xFF;
    (*p)[2] = (value >> 16) & 0xFF;
    (*p)[3] = value >> 24;
    *p += 4;
}

answered Nov 11 '22 07:11

PSkocik
