
Memory layout of packed classes + inheritance

Tags:

c++

g++

Consider two classes, Base and Derived, where Derived publicly inherits from Base.

#include <iostream>

struct Base {
  public:
  unsigned char a;
  int b;
  Base() : a('a'), b (2) { };
  unsigned char get_a() {
    return a;
  }
  int get_b() {
    return b;
  }
} __attribute__ ((__packed__)) ;

struct __attribute__ ((__packed__)) Derived : public Base {
  public:
  unsigned char c;
  int d;
  Derived() : c('c'), d(4) { };

  unsigned char get_c() {
    return c;
  }
  int get_d() {
    return d;
  }
};

I ran some experiments on this; here are my findings and questions.

First, you cannot pack only one class in the inheritance chain: either every class in the chain is packed or none is.

Second, after packing, the memory for the two classes is contiguous: first the memory for Base, then the memory for Derived's own members. For example, the memory layout of Derived would be 5 bytes for Base followed by 5 bytes for Derived. Is that guaranteed?

Third, this is related to my work and is an interesting problem IMO. In my flow, up to a certain point I need to use Base class pointers; after that point, I would convert all of the Base class pointers to Derived class pointers by doing a memcpy or similar. Of course, I would then fill in the values of the Derived members. How can I do that?

What approaches would you suggest for the third point? Without virtual, since virtual would add 4/8 bytes (a vtable pointer) to each object.

Hemant Bhargava asked Sep 13 '19 11:09


2 Answers

The packed attribute is compiler-specific and non-standard. Nevertheless, it has existed for decades, and all compilers I have used (gcc, clang, icc, msvc, sun cc) support it in some form and it behaves as expected: it removes all padding.

Another consideration is that a compiler that did not handle the packed attribute (whatever the specific syntax is) as expected would break a lot of existing, fundamental code and hence limit its own usability for no good reason.

It would be ideal for the C++ standard to standardise this functionality. The only difficulty, I guess, is that someone has to write a proposal and see it through to make it into the standard.


Another option is to store each member in an unsigned char array, which has a natural alignment of 1 byte. Padding exists only to satisfy alignment requirements, and no padding is needed to align to 1:

#include <cstring>
#include <iostream>
#include <type_traits>

template<class T>
struct PackedMember {
    static_assert(std::is_trivial<T>::value, "T must be a trivial type");
    unsigned char storage_[sizeof(T)];

    PackedMember() noexcept : storage_() // Zero-initialized.
    {}

    PackedMember(T const& t) noexcept {
        set(t);
    }

    T get() const noexcept {
        T t;
        std::memcpy(&t, storage_, sizeof(T));
        return t;
    }

    void set(T const& t) noexcept {
        std::memcpy(storage_, &t, sizeof(T));
    }
};

struct Base {
    PackedMember<char> a = 'a';
    PackedMember<int> b = 2;
};

struct Derived : Base {
    PackedMember<char> c = 'c';
    PackedMember<int> d = 4;
};

int main() {
    Derived d;

    std::cout << sizeof(d) << '\n';
    std::cout << d.a.get() << '\n';
    std::cout << d.b.get() << '\n';
    std::cout << d.c.get() << '\n';
    std::cout << d.d.get() << '\n';
}

Outputs:

10
a
2
c
4

The main idea here is to remove the padding while retaining the member names. Accessing members by index into a single byte array would be less than ideal: the code becomes harder to understand, or you have to create an enum of indexes to keep it readable.

On modern x86-64 CPUs each of these std::memcpy calls compiles to a plain mov instruction, just as for aligned members. On these CPUs (since Sandy Bridge, 2011) an unaligned access costs nothing extra unless it crosses a cache line boundary. This is a must-have CPU optimization nowadays: it lets large datasets be stored and processed in memory without wasting memory and cache capacity on padding, and without losing performance to the lack of alignment. I am not familiar enough with other CPU architectures to comment on those.

Maxim Egorushkin answered Sep 28 '22 07:09


First, all of this is highly compiler-specific, as you are talking about compiler extensions.

The __attribute__ ((__packed__)) in gcc sets the alignment to 1 and removes padding. That means sizeof(Base) should be 5, and Derived adds another 5 bytes to that, as you suggest (assuming sizeof(int) == 4). This is how packed works in gcc and related compilers like clang. Use a different compiler and all bets are off.

As for converting the pointers, give Derived a constructor that takes a const Base & argument. The compiler should then generate the appropriate copy (effectively a memcpy()) for the Base part.

Overall, you should have a very good reason to use packed, as operating on packed structures can produce highly inefficient code on some targets.

Goswin von Brederlow answered Sep 28 '22 06:09