Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do I need to make a type a POD to persist it with a memory-mapped file?

Pointers cannot be persisted directly to file, because they point to absolute addresses. To address this issue I wrote a relative_ptr template that holds an offset instead of an absolute address.

Based on the fact that only trivially copyable types can be safely copied bit-by-bit, I made the assumption that this type needed to be trivially copyable to be safely persisted in a memory-mapped file and retrieved later on.

This restriction turned out to be a bit problematic, because the compiler generated copy constructor does not behave in a meaningful way. I found nothing that forbid me from defaulting the copy constructor and making it private, so I made it private to avoid accidental copies that would lead to undefined behaviour.

Later on, I found boost::interprocess::offset_ptr whose creation was driven by the same needs. However, it turns out that offset_ptr is not trivially copyable because it implements its own custom copy constructor.

Is my assumption that the smart pointer needs to be trivially copyable to be persisted safely wrong?

If there's no such restriction, I wonder if I can safely do the following as well. If not, exactly what are the requirements a type must fulfill to be usable in the scenario I described above?

struct base {
    int x;
    virtual void f() = 0;
    virtual ~base() {} // virtual members!
};

struct derived : virtual base {
    int x;
    void f() { std::cout << x; }
};

using namespace boost::interprocess;

void persist() {
    file_mapping file("blah");
    mapped_region region(file, read_write, 128, sizeof(derived));
    // create object on a memory-mapped file
    derived* d = new (region.get_address()) derived();
    d.x = 42;
    d->f();
    region.flush();
}

void retrieve() {
    file_mapping file("blah");
    mapped_region region(file, read_write, 128, sizeof(derived));
    derived* d = region.get_address();
    d->f();
}

int main() {
    persist();
    retrieve();
}

Thanks to all those that provided alternatives. It's unlikely that I will be using something else any time soon, because as I explained, I already have a working solution. And as you can see from the use of question marks above, I'm really interested in knowing why Boost can get away without a trivially copyable type, and how far can you go with it: it's quite obvious that classes with virtual members will not work, but where do you draw the line?

like image 695
R. Martinho Fernandes Avatar asked Sep 04 '11 17:09

R. Martinho Fernandes


3 Answers

To avoid confusion let me restate the problem.

You want to create an object in mapped memory in such a way that after the application is closed and reopened the file can be mapped once again and object used without further deserialization.

POD is kind of a red herring for what you are trying to do. You don't need to be binary copyable (what POD means); you need to be address-independent.

Address-independence requires you to:

  • avoid all absolute pointers.
  • only use offset pointers to addresses within the mapped memory.

There are a few correlaries that follow from these rules.

  • You can't use virtual anything. C++ virtual functions are implemented with a hidden vtable pointer in the class instance. The vtable pointer is an absolute pointer over which you don't have any control.
  • You need to be very careful about the other C++ objects your address-independent objects use. Basically everything in the standard library may break if you use them. Even if they don't use new they may use virtual functions internally, or just store the address of a pointer.
  • You can't store references in the address-independent objects. Reference members are just syntactic sugar over absolute pointers.

Inheritance is still possible but of limited usefulness since virtual is outlawed.

Any and all constructors / destructors are fine as long as the above rules are followed.

Even Boost.Interprocess isn't a perfect fit for what you're trying to do. Boost.Interprocess also needs to manage shared access to the objects, whereas you can assume that you're only one messing with the memory.

In the end it may be simpler / saner to just use Google Protobufs and conventional serialization.

like image 152
deft_code Avatar answered Nov 02 '22 10:11

deft_code


Yes, but for reasons other than the ones that seem to concern you.

You've got virtual functions and a virtual base class. These lead to a host of pointers created behind your back by the compiler. You can't turn them into offsets or anything else.

If you want to do this style of persistence, you need to eschew 'virtual'. After that, it's all a matter of the semantics. Really, just pretend you were doing this in C.

like image 4
bmargulies Avatar answered Nov 02 '22 10:11

bmargulies


Even PoD has pitfalls if you are interested in interoperating across different systems or across time.

You might look at Google Protocol Buffers for a way to do this in a portable fashion.

like image 2
Steve Dispensa Avatar answered Nov 02 '22 10:11

Steve Dispensa