There are many situations (especially in low-level programming), where the binary layout of the data is important. For example: hardware/driver manipulation, network protocols, etc.
In C++ I can read/write arbitrary binary structures using char*
and bitwise operations (masks and shifts), but that's tedious and error-prone. Obviously, I try to limit the scope of these operations and encapsulate them in higher-level APIs, but it's still a pain.
C++ bitfields seem to offer a developer-friendly solution to this problem, but unfortunately their storage is implementation specific.
NathanOliver mentionned std::bitset
which basically allows you to access individual bits of an integer with a nice operator[]
but lacks accessors for multi-bit fields.
Using meta-programming and/or macros, it's possible to abstract the bitwise operations in a library. Since I don't want to reinvent the wheel, I'm looking for a (preferably STL or boost) library that does that.
For the record, I'm looking into this for a DNS resolver, but the problem and its solution should be generic.
Edit: short answer: it turns out bitfield's storage is reliable in practice (even if it's not mandated by the standard) since system/network libraries use them and yeild well behaved programs when compiled with mainstream compilers.
From the C++14 standard (N3797 draft), section 9.6 [class.bit], paragraph 1:
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit. [ Note: Bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. — end note ]
Although notes are non-normative, every implementation I'm aware of uses one of two layouts: either big-endian or little endian bit order.
Note that:
<cstdint>
).For examples, look in netinet/tcp.h
and other nearby headers.
Edit by OP: for example tcp.h
defines
struct
{
u_int16_t th_sport; /* source port */
u_int16_t th_dport; /* destination port */
tcp_seq th_seq; /* sequence number */
tcp_seq th_ack; /* acknowledgement number */
# if __BYTE_ORDER == __LITTLE_ENDIAN
u_int8_t th_x2:4; /* (unused) */
u_int8_t th_off:4; /* data offset */
# endif
# if __BYTE_ORDER == __BIG_ENDIAN
u_int8_t th_off:4; /* data offset */
u_int8_t th_x2:4; /* (unused) */
# endif
// ...
}
And since it works with mainstream compilers, it means bitset's memory layout is reliable in practice.
Edit:
This is portable within one endianness:
struct Foo {
uint16_t x: 10;
uint16_t y: 6;
};
But this may not be because it straddles a 16-bit unit:
struct Foo {
uint16_t x: 10;
uint16_t y: 12;
uint16_t z: 10;
};
And this may not be because it has implicit padding:
struct Foo {
uint16_t x: 10;
};
We have this in production code where we had to port MIPS code to x86-64
https://codereview.stackexchange.com/questions/54342/template-for-endianness-free-code-data-always-packed-as-big-endian
Works well for us.
It's basically a template without any storage, the template arguments specify the position of the relevant bits.
If you need multiple fields, you put multiple specializations of the template together in a union, together with an array of bytes to provide storage.
The template has overloads for assignment of value and a conversion operator to unsigned
for reading the value.
In addition, if the fields are larger than a byte, they are stored in big-endian byte order, which is sometimes useful when implementing cross-platform protocols.
here's a usage example:
union header
{
unsigned char arr[2]; // space allocation, 2 bytes (16 bits)
BitFieldMember<0, 4> m1; // first 4 bits
BitFieldMember<4, 5> m2; // The following 5 bits
BitFieldMember<9, 6> m3; // The following 6 bits, total 16 bits
};
int main()
{
header a;
memset(a.arr, 0, sizeof(a.arr));
a.m1 = rand();
a.m3 = a.m1;
a.m2 = ~a.m1;
return 0;
}
It's simple to implement bit fields with known positions with C++:
template<typename T, int POS, int SIZE>
struct BitField {
T *data;
BitField(T *data) : data(data) {}
operator int() const {
return ((*data) >> POS) & ((1ULL << SIZE)-1);
}
BitField& operator=(int x) {
T mask( ((1ULL << SIZE)-1) << POS );
*data = (*data & ~mask) | ((x << POS) & mask);
return *this;
}
};
The above toy implementation allows for example to define a 12-bit field in a unsigned long long
variable with
unsigned long long var;
BitField<unsigned long long, 7, 12> muxno(&var);
and the generated code to access the field value is just
0000000000000020 <_Z6getMuxv>:
20: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax ; Get &var
27: 48 8b 00 mov (%rax),%rax ; Get content
2a: 48 c1 e8 07 shr $0x7,%rax ; >> 7
2e: 25 ff 0f 00 00 and $0xfff,%eax ; keep 12 bits
33: c3 retq
Basically what you'd have to write by hand
I have written an implementation of bit fields in C++ as a library header file. An example I give in the documentation is that, instead of writing this:
struct A
{
union
{
struct
{
unsigned x : 5;
unsigned a0 : 2;
unsigned a1 : 2;
unsigned a2 : 2;
}
u;
struct
{
unsigned x : 5;
unsigned all_a : 6;
}
v;
};
};
// …
A x;
x.v.all_a = 0x3f;
x.u.a1 = 0;
you can write:
typedef Bitfield<Bitfield_traits_default<> > Bf;
struct A : private Bitfield_fmt
{
F<5> x;
F<2> a[3];
};
typedef Bitfield_w_fmt<Bf, A> Bwf;
// …
Bwf::Format::Define::T x;
BITF(Bwf, x, a) = 0x3f;
BITF(Bwf, x, a[1]) = 0;
There's an alternative interface, under which the last two lines of the above would change to:
#define BITF_U_X_BWF Bwf
#define BITF_U_X_BASE x
BITF(X, a) = 0x3f;
BITF(X, a[1]) = 0;
Using this implementation of bit fields, the traits template parameter gives the programmer a lot of flexibility. Memory is just processor memory by default, or it can be an abstraction, with the programmer providing functions to perform "memory" reads and writes. The abstracted memory is a sequence of elements of any unsigned integral type (chosen by the programmer). Fields can be laid out either from least-to-most or most-to-least significance. The layout of fields in memory can be the reverse of what they are in the format structure.
The implementation is located at: https://github.com/wkaras/C-plus-plus-library-bit-fields
(As you can see, I unfortunately was not able to fully avoid use of macros.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With