 

How can I zero just the padding bytes of a class?

Tags: c++, padding

I want to set the padding bytes of a class to 0, since I am saving/loading/comparing/hashing instances at a byte level, and garbage-initialised padding introduces non-determinism in each of those operations.

I know that this will achieve what I want (for trivially copyable types):

#include <cstring>

struct Example
{
    Example(char a_, int b_)
    {
        // Zero the whole object, padding included, before assigning members.
        std::memset(this, 0, sizeof(*this));
        a = a_;
        b = b_;
    }
    char a;
    int b;
};

I don't like doing that though, for two reasons: I like constructor initialiser lists, and I know that setting the bits to 0 isn't always the same as zero-initialisation (e.g. pointers and floats don't necessarily have zero values that are all 0 bits).

As an aside, it's obviously limited to types that are trivially copyable, but that's not an issue for me since the operations I listed above (loading/saving/comparing/hashing at a byte level) require trivially copyable types anyway.

What I would like is something like this [magical] snippet:

struct Example
{
    Example(char a_, int b_) : a(a_), b(b_)
    {
        // Leaves all members alone, and sets all padding bytes to 0.
        memset_only_padding_bytes(this, 0);
    }
    char a;
    int b;
};

I doubt such a thing is possible, so if anyone can suggest a non-ugly alternative... I'm all ears :)

Ben Hymers asked Oct 23 '13 15:10





3 Answers

There's no way I know of to do this fully automatically in pure C++. We use a custom code generation system to accomplish this (among other things). You could potentially accomplish this with a macro to which you fed all your member variable names; it would simply look for holes between offsetof(T, memberA) + sizeof(memberA) and offsetof(T, memberB), where T is the class type.
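A minimal sketch of the offsetof idea, assuming a standard-layout type whose members are listed by hand in declaration order. The names `MemberSpan`, `zero_padding` and `MEMBER_SPAN` are made up for illustration and are not part of any library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: where one member lives inside its struct.
struct MemberSpan {
    std::size_t offset;
    std::size_t size;
};

// Zero every byte of the object that is not covered by a listed member.
// Assumes `spans` is sorted by offset and lists every member exactly once.
inline void zero_padding(void* object, std::size_t object_size,
                         const MemberSpan* spans, std::size_t count)
{
    unsigned char* bytes = static_cast<unsigned char*>(object);
    std::size_t cursor = 0;
    for (std::size_t i = 0; i < count; ++i) {
        assert(spans[i].offset >= cursor);  // members must be in layout order
        std::memset(bytes + cursor, 0, spans[i].offset - cursor);  // hole before member
        cursor = spans[i].offset + spans[i].size;
    }
    assert(object_size >= cursor);
    std::memset(bytes + cursor, 0, object_size - cursor);  // tail padding
}

// Hypothetical macro: builds one table entry from a type and a member name.
#define MEMBER_SPAN(Type, member) { offsetof(Type, member), sizeof(Type::member) }

struct Example {
    Example(char a_, int b_) : a(a_), b(b_)
    {
        static const MemberSpan spans[] = {
            MEMBER_SPAN(Example, a),
            MEMBER_SPAN(Example, b),
        };
        zero_padding(this, sizeof(*this), spans, sizeof(spans) / sizeof(spans[0]));
    }
    char a;
    int b;
};
```

The member list has to be kept in sync with the class by hand, which is the main weakness of this approach. The language also does not promise that padding keeps its value once the object is later copied or assigned as a whole, so this is best treated as a step applied right before the bytes are saved, compared or hashed.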

Alternatively, serialize/hash on a memberwise basis, rather than as a binary blob. That's ten kinds of cleaner.
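As a rough illustration of the memberwise approach, the sketch below feeds each member's bytes into a running FNV-1a hash, so padding never reaches the hash at all. The function names `hash_bytes` and `hash_example` are assumptions chosen for this example:

```cpp
#include <cstddef>
#include <cstdint>

// FNV-1a over a byte range, continuing from a running hash value.
inline std::uint64_t hash_bytes(const void* data, std::size_t size, std::uint64_t hash)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= p[i];
        hash *= 1099511628211ULL;  // FNV-1a 64-bit prime
    }
    return hash;
}

struct Example {
    char a;
    int b;
};

// Hash the members one by one; the bytes between them are never read.
inline std::uint64_t hash_example(const Example& e)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    hash = hash_bytes(&e.a, sizeof(e.a), hash);
    hash = hash_bytes(&e.b, sizeof(e.b), hash);
    return hash;
}
```

Two objects with equal members then hash equally regardless of what their padding holds. The same pattern applies to saving and comparing: write or compare `a` and `b` individually instead of the whole object.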

Oh, one other option: you could provide an operator new that explicitly clears the memory before returning it. I'm not a fan of that approach, though: it doesn't work for stack allocation.

Sneftel answered Oct 14 '22 00:10



You should never use padded structs when writing or reading them as binary, simply because the padding can vary from one platform to another, which leads to binary incompatibility.

Use a compiler directive such as #pragma pack(push, 1) to disable padding when defining those writable structs, and restore the previous packing with #pragma pack(pop).
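A short sketch of the packing directive, assuming a compiler that supports `#pragma pack` (MSVC, GCC and Clang all do). The struct name `PackedExample` is made up for this example, and the static assertions document the layout being relied on:

```cpp
#include <cstddef>

#pragma pack(push, 1)  // byte packing: no padding between members
struct PackedExample {
    char a;
    int b;
};
#pragma pack(pop)      // restore whatever packing was in effect before

// Fail the build if the compiler still inserted padding.
static_assert(sizeof(PackedExample) == sizeof(char) + sizeof(int),
              "PackedExample should contain no padding bytes");
static_assert(offsetof(PackedExample, b) == sizeof(char),
              "b should directly follow a");
```

With no padding in the object, a plain memcmp or bytewise hash sees only member data. Note that `b` is now misaligned, so taking a pointer or reference to it is risky; reading and writing it by value through the struct is the safer habit.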

This sadly means you lose the alignment optimization that padding provides. If that is a concern, you can carefully design your structs and "pad" them manually by inserting dummy member variables. Zero-initialization then becomes obvious: you just assign zeros to those dummies. I don't recommend this manual approach, as it's very error-prone, but since you're already writing binary blobs you are probably dealing with that kind of risk anyway. Either way, benchmark the unpadded structs first.
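The manual variant might look like the sketch below. It assumes that `int` is aligned to its own size, which the static assertion checks at compile time; the member name `pad_` is hypothetical:

```cpp
#include <cstddef>

struct Example {
    // pad_() value-initializes the filler array, i.e. sets it to all zeros.
    Example(char a_, int b_) : a(a_), pad_(), b(b_) {}

    char a;
    char pad_[sizeof(int) - sizeof(char)];  // explicit filler replacing the implicit gap
    int b;
};

// If the members account for every byte, the compiler added no hidden padding.
static_assert(sizeof(Example) == sizeof(char) + sizeof(Example::pad_) + sizeof(int),
              "Example should contain no implicit padding");
```

This keeps `b` naturally aligned and still works with constructor initialiser lists. The size check is what guards against the error-prone part: if a member is added or a platform aligns differently, the build fails instead of silently reintroducing padding. On C++17 compilers, `std::has_unique_object_representations` offers a similar check.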

Agent_L answered Oct 14 '22 00:10



I faced a similar problem - and simply saying that this is a poor design decision (as per dasblinkenlight's comment) doesn't necessarily help as you may have no control over the hashing code (in my case I was using an external library).

One solution is to write a custom iterator for your class, which iterates through the bytes of the data and skips the padding. You then modify your hashing algorithm to use your custom iterator instead of a pointer. One simple way to do this is to turn the pointer type into a template parameter so that the function can take an iterator. Since a pointer and an iterator are used in the same way (dereference, increment, compare), you shouldn't have to modify any code beyond adding the template.

EDIT: Boost provides a nice library which makes it simple to add custom iterators to your container: Boost.Iterator.
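To make the idea concrete, here is a hand-written sketch that does not use Boost.Iterator. `hash_range` stands in for the external hashing routine after its pointer parameter has been turned into a template parameter, and `ExampleByteIterator` is a hypothetical, deliberately minimal iterator (it is not a full standard-conforming iterator type):

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the library hash (FNV-1a here), templated on the iterator type
// so it accepts either a raw byte pointer or a custom iterator.
template <typename Iter>
std::uint64_t hash_range(Iter first, Iter last)
{
    std::uint64_t hash = 14695981039346656037ULL;
    for (; first != last; ++first) {
        hash ^= static_cast<unsigned char>(*first);
        hash *= 1099511628211ULL;
    }
    return hash;
}

struct Example {
    char a;
    int b;
};

// Walks the bytes of an Example, jumping over anything that is not a member.
class ExampleByteIterator {
public:
    ExampleByteIterator(const Example& e, std::size_t pos)
        : bytes_(reinterpret_cast<const unsigned char*>(&e)), pos_(pos)
    {
        skip_padding();
    }
    unsigned char operator*() const { return bytes_[pos_]; }
    ExampleByteIterator& operator++() { ++pos_; skip_padding(); return *this; }
    bool operator!=(const ExampleByteIterator& other) const { return pos_ != other.pos_; }

private:
    void skip_padding()
    {
        const std::size_t a_end = offsetof(Example, a) + sizeof(Example::a);
        const std::size_t b_begin = offsetof(Example, b);
        const std::size_t b_end = b_begin + sizeof(Example::b);
        if (pos_ >= a_end && pos_ < b_begin) {
            pos_ = b_begin;           // hole between a and b
        } else if (pos_ >= b_end) {
            pos_ = sizeof(Example);   // tail padding, if any: jump to the end
        }
    }
    const unsigned char* bytes_;
    std::size_t pos_;
};

inline std::uint64_t hash_example(const Example& e)
{
    return hash_range(ExampleByteIterator(e, 0),
                      ExampleByteIterator(e, sizeof(Example)));
}
```

The hashing code itself never learns about the layout; only the iterator does, so the padding rules stay in one place next to the struct.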

Whichever solution you go for, it is highly preferable to avoid hashing the padding as doing so means that your hashing algorithm is highly coupled with your data structure. If you switch data structures (or as Agent_L mentions, use the same data structure on a different platform which pads differently), then it will produce different hashes. On the other hand, if you only hash the actual data itself, then you will always produce the same hash values no matter what data structure you use later.

JBentley answered Oct 14 '22 00:10
