Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why bit endianness is an issue in bitfields?

Any portable code that uses bitfields seems to distinguish between little- and big-endian platforms. See the declaration of struct iphdr in linux kernel for an example of such code. I fail to understand why bit endianness is an issue at all.

As far as I understand, bitfields are purely compiler constructs, used to facilitate bit level manipulations.

For instance, consider the following bitfield:

 struct ParsedInt {     unsigned int f1:1;     unsigned int f2:3;     unsigned int f3:4; }; uint8_t i; struct ParsedInt *d = &i; 
Here, writing d->f2 is simply a compact and readable way of saying (i>>1) & (1<<4 - 1).

However, bit operations are well-defined and work regardless of the architecture. So, how come bitfields are not portable?

like image 353
Leonid99 Avatar asked May 18 '11 10:05

Leonid99


People also ask

Does endianness affect bit order?

Bit order usually follows the same endianness as the byte order for a given computer system. That is, in a big endian system the most significant bit is stored at the lowest bit address; in a little endian system, the least significant bit is stored at the lowest bit address.

Does endianness matter for a single byte?

Again, endian-ness does not matter if you have a single byte. If you have one byte, it's the only data you read so there's only one way to interpret it (again, because computers agree on what a byte is).

How do you know if you have endianness?

If it is little-endian, it would be stored as “01 00 00 00”. The program checks the first byte by dereferencing the cptr pointer. If it equals to 0, it means the processor is big-endian(“00 00 00 01”), If it equals to 1, it means the processor is little-endian (“01 00 00 00”).

How bit fields are stored in memory?

Again, storage of bit fields in memory is done with a byte-by-byte, rather than bit-by-bit, transfer.


1 Answers

By the C standard, the compiler is free to store the bit field pretty much in any random way it wants. You can never make any assumptions of where the bits are allocated. Here are just a few bit-field related things that are not specified by the C standard:

Unspecified behavior

  • The alignment of the addressable storage unit allocated to hold a bit-field (6.7.2.1).

Implementation-defined behavior

  • Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
  • The order of allocation of bit-fields within a unit (6.7.2.1).

Big/little endian is of course also implementation-defined. This means that your struct could be allocated in the following ways (assuming 16 bit ints):

PADDING : 8 f1 : 1 f2 : 3 f3 : 4  or  PADDING : 8 f3 : 4 f2 : 3 f1 : 1  or  f1 : 1 f2 : 3 f3 : 4 PADDING : 8  or  f3 : 4 f2 : 3 f1 : 1 PADDING : 8 

Which one applies? Take a guess, or read in-depth backend documentation of your compiler. Add the complexity of 32-bit integers, in big- or little endian, to this. Then add the fact that the compiler is allowed to add any number of padding bytes anywhere inside your bit field, because it is treated as a struct (it can't add padding at the very beginning of the struct, but everywhere else).

And then I haven't even mentioned what happens if you use plain "int" as bit-field type = implementation-defined behavior, or if you use any other type than (unsigned) int = implementation-defined behavior.

So to answer the question, there is no such thing as portable bit-field code, because the C standard is extremely vague with how bit fields should be implemented. The only thing bit-fields can be trusted with is to be chunks of boolean values, where the programmer isn't concerned of the location of the bits in memory.

The only portable solution is to use the bit-wise operators instead of bit fields. The generated machine code will be exactly the same, but deterministic. Bit-wise operators are 100% portable on any C compiler for any system.

like image 148
Lundin Avatar answered Sep 19 '22 08:09

Lundin