Is there a way to enforce specific endianness for a C or C++ struct?

I've seen a few questions and answers regarding the endianness of structs, but they were about detecting the endianness of a system or converting data between the two endiannesses.

What I would like to know, however, is whether there is a way to enforce a specific endianness for a given struct. Are there any good compiler directives or other simple solutions besides rewriting the whole thing with a lot of macros manipulating bitfields?

A general solution would be nice, but I would be happy with a gcc-specific solution as well.

Edit:

Thank you for all the comments pointing out why it's not a good idea to enforce endianness, but in my case that's exactly what I need.

A large amount of data is generated by a specific processor (which will never change; it's an embedded system with custom hardware), and it has to be read by a program (which I am working on) running on an unknown processor. Byte-wise evaluation of the data would be horribly troublesome because it consists of hundreds of different types of structs, which are huge and deeply nested: most of them have many layers of other huge structs inside.

Changing the software for the embedded processor is out of the question. The source is available, which is why I intend to use the structs from that system instead of starting from scratch and evaluating all the data byte-wise.

This is why I need to tell the compiler which endianness it should use; it doesn't matter how efficient or inefficient it will be.

It does not have to be a real change in endianness. Even if it's just an interface, and physically everything is handled in the processor's own endianness, it's perfectly acceptable to me.

asked Jul 18 '11 by vsz

People also ask

Does endianness affect structs?

Endianness shouldn't have an effect. The offset of the first element of a struct should always be zero. The offset of each subsequent element should be larger than that of its predecessor.

What determines the endianness?

Broadly speaking, the endianness in use is determined by the CPU. Because there are a number of options, it is unsurprising that different semiconductor vendors have chosen different endianness for their CPUs.

What is meant by endianness in C?

Endianness. The attribute of a system that indicates whether integers are represented with the most significant byte stored at the lowest address (big endian) or at the highest address (little endian). Each address stores one element of the memory array.

How can unions detect endianness?

We can also check the endianness of the machine using a union. We need to create a union that has an integer variable and an array of 4 characters. If the first element of the character array (au8DataBuff[0]) is equal to the least-significant byte of the integer, then the system is little-endian; otherwise it is big-endian.
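A minimal sketch of that check (variable name au8DataBuff taken from the description above; note that inspecting the inactive member of a union is the classic C idiom, and GCC also defines this behaviour in C++):

#include <cstdint>
#include <cstdio>

// Sketch: detect the machine's endianness at run time via a union, as
// described above. The char array aliases the bytes of the integer.
bool is_little_endian() {
    union {
        std::uint32_t u32;
        unsigned char au8DataBuff[4];
    } probe;
    probe.u32 = 0x01020304u;
    return probe.au8DataBuff[0] == 0x04;  // LSB stored first => little-endian
}

int main() {
    std::printf("%s-endian\n", is_little_endian() ? "little" : "big");
    return 0;
}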


2 Answers

The way I usually handle this is like so:

#include <arpa/inet.h> // for ntohs() etc.
#include <stdint.h>

class be_uint16_t {
public:
    be_uint16_t() : be_val_(0) {
    }
    // Transparently cast from uint16_t
    be_uint16_t(const uint16_t &val) : be_val_(htons(val)) {
    }
    // Transparently cast to uint16_t
    operator uint16_t() const {
        return ntohs(be_val_);
    }
private:
    uint16_t be_val_;
} __attribute__((packed));

Similarly for be_uint32_t.
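As a rough sketch of what that 32-bit counterpart might look like (not spelled out in the answer; it simply follows the same pattern, using ntohl()/htonl() from the same arpa/inet.h header):

// Sketch only: 32-bit counterpart built on the same pattern as be_uint16_t.
class be_uint32_t {
public:
    be_uint32_t() : be_val_(0) {
    }
    // Transparently cast from uint32_t
    be_uint32_t(const uint32_t &val) : be_val_(htonl(val)) {
    }
    // Transparently cast to uint32_t
    operator uint32_t() const {
        return ntohl(be_val_);
    }
private:
    uint32_t be_val_;
} __attribute__((packed));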

Then you can define your struct like this:

struct be_fixed64_t {
    be_uint32_t int_part;
    be_uint32_t frac_part;
} __attribute__((packed));

The point is that the compiler will almost certainly lay out the fields in the order you write them, so all you are really worried about is big-endian integers. The be_uint16_t object is a class that knows how to convert itself transparently between big-endian and machine-endian as required. Like this:

be_uint16_t x = 12;
x = x + 1;                // Yes, this actually works
write(fd, &x, sizeof(x)); // writes 13 to file in big-endian form

In fact, if you compile that snippet with any reasonably good C++ compiler, you should find it emits a big-endian "13" as a constant.

With these objects, the in-memory representation is big-endian. So you can create arrays of them, put them in structures, etc. But when you go to operate on them, they magically cast to machine-endian. This is typically a single instruction on x86, so it is very efficient. There are a few contexts where you have to cast by hand:

be_uint16_t x = 37;
printf("x == %u\n", (unsigned)x); // Fails to compile without the cast

...but for most code, you can just use them as if they were built-in types.
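For example, reading one of those records back might look roughly like this (a sketch that assumes the be_uint32_t and be_fixed64_t definitions above and a hypothetical input file name):

#include <cstdio>

// Sketch: read back one record stored in big-endian form and use its fields
// as if they were plain integers; the byte-order conversion happens in the
// implicit cast from be_uint32_t to uint32_t.
int main() {
    be_fixed64_t rec;
    std::FILE *f = std::fopen("data.bin", "rb");   // hypothetical file name
    if (!f || std::fread(&rec, sizeof rec, 1, f) != 1)
        return 1;
    std::fclose(f);

    uint32_t whole = rec.int_part;     // converted to machine-endian here
    rec.frac_part = rec.frac_part + 1; // arithmetic works transparently
    std::printf("int_part == %u, frac_part == %u\n",
                (unsigned)whole, (unsigned)rec.frac_part); // casts as noted above
    return 0;
}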

answered Oct 02 '22 by Nemo

A bit late to the party, but with current GCC (tested on 6.2.1, where it works, and 4.9.2, where it is not implemented) there is finally a way to declare that a struct should be kept in a specific byte order.

The following test program:

#include <stdio.h>
#include <stdint.h>

struct __attribute__((packed, scalar_storage_order("big-endian"))) mystruct {
    uint16_t a;
    uint32_t b;
    uint64_t c;
};

int main(int argc, char** argv)
{
    struct mystruct bar = {.a = 0xaabb, .b = 0xff0000aa, .c = 0xabcdefaabbccddee};

    FILE *f = fopen("out.bin", "wb");
    size_t written = fwrite(&bar, sizeof(struct mystruct), 1, f);
    fclose(f);
}

creates a file "out.bin", which you can inspect with a hex editor (e.g. hexdump -C out.bin). If the scalar_storage_order attribute is supported, it will contain the expected bytes 0xaabbff0000aaabcdefaabbccddee in this order and without holes. Sadly, this is of course very compiler-specific.
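If the code also has to build on compilers that lack this attribute, one possible guard (a sketch, assuming GCC 5+'s __has_attribute; the BE_LAYOUT macro name is made up here) is a feature check along these lines:

#include <stdint.h>

/* Sketch: apply the attribute only where the compiler understands it.
 * __has_attribute exists in GCC 5+ and Clang; scalar_storage_order itself
 * is GCC 6.1+ only. On other compilers BE_LAYOUT degrades to plain packing
 * and the caller must byte-swap explicitly. */
#if defined(__has_attribute)
#  if __has_attribute(scalar_storage_order)
#    define BE_LAYOUT __attribute__((packed, scalar_storage_order("big-endian")))
#  endif
#endif
#ifndef BE_LAYOUT
#  define BE_LAYOUT __attribute__((packed))  /* fallback: no byte-order guarantee */
#endif

struct BE_LAYOUT mystruct {
    uint16_t a;
    uint32_t b;
    uint64_t c;
};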

answered Oct 02 '22 by Niklas Schnelle