
C++: how to serialize/deserialize objects without the use of libraries?

I am trying to understand how serialization/deserialization works in C++ without the use of libraries. I started with simple objects, but when deserializing a vector I found that I can't read the vector back without having written its size first. Moreover, I don't know which file format to choose, because if digits precede the vector's size, I can't read it correctly. I also want to do this with classes and map containers. My task is to serialize/deserialize an object like this:

struct PersonInfo
{
    unsigned int age_;
    string name_;
    enum { undef, man, woman } sex_;
};

struct Person : PersonInfo
{
    vector<Person> children_;
    map<string, PersonInfo> addrBook_;
};

Currently I know how to serialize simple objects like this:

vector<PersonInfo> vecPersonInfo;
vecPersonInfo.push_back(*personInfo);
vecPersonInfo.push_back(*oneMorePersonInfo);

ofstream file("file", ios::out | ios::binary);
if (!file) {
    cout<<"can not open file";
} else {
    vector<PersonInfo>::const_iterator iterator = vecPersonInfo.begin();
    for (; iterator != vecPersonInfo.end(); iterator++) {
        file<<*iterator;
    }
}

Could you please suggest how I can do this for this complex object, or point me to a good tutorial that explains it clearly?

Winte Winte asked Jul 10 '12 14:07


2 Answers

One pattern is to implement an abstract class that declares the serialization functions; each derived class then defines what goes into the serializer and what comes out. An example would be:

class Serializable
{
public:
    Serializable(){}
    virtual ~Serializable(){}

    virtual void serialize(std::ostream& stream) = 0;
    virtual void deserialize(std::istream& stream) = 0;
};

You then implement Serializable interface for the class/struct that you want to serialize:

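struct PersonInfo : public Serializable // Yes! It's possible
{
    unsigned int age_;
    string name_;
    enum { undef, man, woman } sex_;

    virtual void serialize(std::ostream& stream)
    {
        // Serialization code: separate the fields so they can be read back
        stream << age_ << ' ' << name_ << ' ' << sex_ << '\n';
    }

    virtual void deserialize(std::istream& stream)
    {
        // Deserialization code: operator>> is not defined for enums, so read an int first
        int sex = 0;
        stream >> age_ >> name_ >> sex;
        sex_ = static_cast<decltype(sex_)>(sex);
    }
};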

The rest I believe you know. There are a few hurdles to get past, though, which you can tackle at your leisure:

  1. When you write a string containing spaces to the stream and try to read it back, you get only the first word, and the rest of the string corrupts the values read after it.
  2. How can you make the format cross-platform (little-endian vs. big-endian)?
  3. How can your program automatically detect which class to create when deserializing?

Clues:

  1. Use a custom serializer that has functions to write bool, int, float, strings, etc.
  2. Use a string to represent the type of the object being serialized, and use a factory to create an instance of that type when deserializing.
  3. Use predefined macros to determine which platform your code is being compiled for.
  4. Always write files in a fixed endianness and make the platforms that use the other endianness adjust to it.
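
As a rough sketch of the first and last clues, the helpers below write a 32-bit integer one byte at a time in a fixed (little-endian) order, and store a string as a length prefix followed by its raw bytes, so spaces survive the round trip. The names writeU32, readU32, writeString and readString are hypothetical, and the 32-bit length prefix is an assumption rather than a recommendation.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>   // std::stringstream, handy for in-memory round trips
#include <stdexcept>
#include <string>

// Hypothetical helper: writes 4 bytes, least significant first,
// whatever the byte order of the host machine is.
void writeU32(std::ostream& out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i) {
        out.put(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: reads the 4 bytes back in the same fixed order.
std::uint32_t readU32(std::istream& in)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        const int byte = in.get();
        if (byte == std::char_traits<char>::eof()) {
            throw std::runtime_error("unexpected end of stream");
        }
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(byte)) << (8 * i);
    }
    return value;
}

// A string is stored as its length followed by its raw bytes.
void writeString(std::ostream& out, const std::string& text)
{
    assert(text.size() <= 0xFFFFFFFFu); // the length must fit in the 32-bit prefix
    writeU32(out, static_cast<std::uint32_t>(text.size()));
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
}

std::string readString(std::istream& in)
{
    const std::uint32_t size = readU32(in);
    std::string text(size, '\0'); // sketch only: real code should cap the size first
    if (size > 0 && !in.read(&text[0], static_cast<std::streamsize>(size))) {
        throw std::runtime_error("unexpected end of stream");
    }
    return text;
}
```

Because the byte order is fixed by the shifts rather than by the memory layout, no platform macros are needed for integers in this sketch. When writing to a file, the stream should be opened with ios::binary so that no newline translation happens.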
Vite Falcon answered Oct 16 '22 20:10


The most basic form is to define a "Serialisable" interface (abstract class) that defines virtual read/write methods. You also define a "Stream" interface that provides a common API for basic primitive types (e.g. reading/writing of ints, floats, bytes, chars, seek/reset) and maybe for some compound types (arrays of values e.g. for strings, vectors, etc.) which operates on a stream. You can use the C++ IOStreams if it suits you.

You will also need some kind of id system, both for a factory to create the corresponding class when loading/deserialising, and for referencing when serialising complex types, so that each logical part is tagged with a header carrying the proper structure/length information when necessary.
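
One possible shape for such an id system is sketched below. It assumes a variant of the Serializable interface with an extra typeId() method, a text format in which every object is preceded by its type tag, and names containing no spaces. Factory, save, load and makeFactory are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>   // std::stringstream, handy for in-memory round trips
#include <stdexcept>
#include <string>

// Assumed variant of the interface: typeId() names the concrete class.
class Serializable
{
public:
    virtual ~Serializable() {}
    virtual std::string typeId() const = 0;
    virtual void serialize(std::ostream& stream) const = 0;
    virtual void deserialize(std::istream& stream) = 0;
};

// Maps a type tag to a function that creates an empty object of that type.
class Factory
{
public:
    typedef std::function<std::unique_ptr<Serializable>()> Creator;

    void registerType(const std::string& id, Creator creator)
    {
        // Tags are whitespace-delimited in this format.
        assert(id.find_first_of(" \t\n") == std::string::npos);
        creators_[id] = creator;
    }

    std::unique_ptr<Serializable> create(const std::string& id) const
    {
        std::map<std::string, Creator>::const_iterator it = creators_.find(id);
        if (it == creators_.end()) {
            throw std::runtime_error("unknown type id: " + id);
        }
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Simplified PersonInfo (no sex_ field, single-word names only).
struct PersonInfo : Serializable
{
    unsigned int age_ = 0;
    std::string name_;

    std::string typeId() const override { return "PersonInfo"; }
    void serialize(std::ostream& stream) const override { stream << age_ << ' ' << name_ << ' '; }
    void deserialize(std::istream& stream) override { stream >> age_ >> name_; }
};

Factory makeFactory()
{
    Factory factory;
    factory.registerType("PersonInfo", [] { return std::unique_ptr<Serializable>(new PersonInfo()); });
    return factory;
}

// Writes the tag first, then the object's own data.
void save(std::ostream& out, const Serializable& object)
{
    out << object.typeId() << ' ';
    object.serialize(out);
}

// Reads the tag, asks the factory for a matching object, then fills it in.
std::unique_ptr<Serializable> load(std::istream& in, const Factory& factory)
{
    std::string id;
    if (!(in >> id)) {
        throw std::runtime_error("missing type tag");
    }
    std::unique_ptr<Serializable> object = factory.create(id);
    object->deserialize(in);
    return object;
}
```

The caller of load does not need to know the concrete type in advance; it can inspect the result with typeId() or dynamic_cast afterwards.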

Then you can create concrete Stream classes for each medium (like Text File, Binary File, In Memory, Network, etc).
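
A minimal sketch of this layering follows, assuming a small hypothetical Stream interface with only int and string operations. TextStream wraps any std::iostream (a file or a string stream), while MemoryStream keeps raw bytes in a buffer in host byte order, which is only suitable within a single process. writePerson and readPerson are illustrative helpers that depend on the interface alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iomanip>   // std::quoted (C++14)
#include <istream>   // std::iostream
#include <sstream>   // std::stringstream, handy for in-memory round trips
#include <string>

// Hypothetical common API for primitive values.
class Stream
{
public:
    virtual ~Stream() {}
    virtual void writeInt(int value) = 0;
    virtual int readInt() = 0;
    virtual void writeString(const std::string& text) = 0;
    virtual std::string readString() = 0;
};

// Text medium: values are whitespace-separated, strings are quoted.
class TextStream : public Stream
{
public:
    explicit TextStream(std::iostream& io) : io_(io) {}
    void writeInt(int value) override { io_ << value << ' '; }
    int readInt() override { int value = 0; io_ >> value; return value; }
    void writeString(const std::string& text) override { io_ << std::quoted(text) << ' '; }
    std::string readString() override { std::string text; io_ >> std::quoted(text); return text; }

private:
    std::iostream& io_;
};

// In-memory medium: raw bytes, with a length prefix for strings.
class MemoryStream : public Stream
{
public:
    void writeInt(int value) override
    {
        buffer_.append(reinterpret_cast<const char*>(&value), sizeof value);
    }
    int readInt() override
    {
        assert(sizeof(int) <= buffer_.size() - readPos_); // enough bytes left
        int value = 0;
        std::memcpy(&value, buffer_.data() + readPos_, sizeof value);
        readPos_ += sizeof value;
        return value;
    }
    void writeString(const std::string& text) override
    {
        writeInt(static_cast<int>(text.size()));
        buffer_ += text;
    }
    std::string readString() override
    {
        const std::size_t size = static_cast<std::size_t>(readInt());
        assert(size <= buffer_.size() - readPos_); // enough bytes left
        std::string text = buffer_.substr(readPos_, size);
        readPos_ += size;
        return text;
    }

private:
    std::string buffer_;
    std::size_t readPos_ = 0;
};

// Serialization code is written once, against the interface only.
void writePerson(Stream& stream, const std::string& name, int age)
{
    stream.writeString(name);
    stream.writeInt(age);
}

void readPerson(Stream& stream, std::string& name, int& age)
{
    name = stream.readString();
    age = stream.readInt();
}
```

The same writePerson call produces readable text through TextStream and a compact byte buffer through MemoryStream, which is the point of keeping the medium behind an interface.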

Each class you want to be serializable then has to inherit the Serializable interface and implement the details (recursively using the Serializable interfaces of its member types if it is a compound/complex class).
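
For the Person type from the question, the recursive part might look like the sketch below. To keep it short, it leaves out the Serializable base class and uses plain member functions, names the enum Sex (an assumption), and uses a text format in which every container is preceded by its element count and strings are written with std::quoted (C++14) so that spaces survive.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>   // std::quoted (C++14)
#include <istream>
#include <map>
#include <ostream>
#include <sstream>   // std::stringstream, handy for in-memory round trips
#include <string>
#include <vector>

struct PersonInfo
{
    enum Sex { undef, man, woman };

    unsigned int age_ = 0;
    std::string name_;
    Sex sex_ = undef;

    void serialize(std::ostream& out) const
    {
        out << age_ << ' ' << std::quoted(name_) << ' ' << static_cast<int>(sex_) << ' ';
    }

    void deserialize(std::istream& in)
    {
        int sex = 0;
        in >> age_ >> std::quoted(name_) >> sex;
        // Sketch only: real code should report bad input instead of asserting.
        assert(sex >= static_cast<int>(undef) && sex <= static_cast<int>(woman));
        sex_ = static_cast<Sex>(sex);
    }
};

struct Person : PersonInfo
{
    std::vector<Person> children_;
    std::map<std::string, PersonInfo> addrBook_;

    void serialize(std::ostream& out) const
    {
        PersonInfo::serialize(out);            // base-class fields first
        out << children_.size() << ' ';        // count, then each child recursively
        for (const Person& child : children_) {
            child.serialize(out);
        }
        out << addrBook_.size() << ' ';        // count, then key/value pairs
        for (const auto& entry : addrBook_) {
            out << std::quoted(entry.first) << ' ';
            entry.second.serialize(out);
        }
    }

    void deserialize(std::istream& in)
    {
        PersonInfo::deserialize(in);
        std::size_t count = 0;
        in >> count;                           // sketch only: validate before allocating
        children_.assign(count, Person());
        for (Person& child : children_) {
            child.deserialize(in);
        }
        in >> count;
        addrBook_.clear();
        for (std::size_t i = 0; i < count; ++i) {
            std::string key;
            in >> std::quoted(key);
            addrBook_[key].deserialize(in);
        }
    }
};
```

Writing the count before the elements is what lets the reader know how many children or address-book entries to expect, and the trailing space after every value keeps neighbouring numbers from running together.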

This is of course a naive and "intrusive" way of adding serialisation (where you must modify the participating classes). You can then use template or preprocessor tricks to make it less intrusive. See Boost or protocol buffers, or any other library for ideas on how this might look in code.

Are you really sure you want to roll your own? It can get really messy, especially when you have pointers between objects (including cycles), which you need to fix up/translate at some point before a load/deserialisation is correct for the current run.
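
To illustrate the pointer fix-up, here is a small sketch with a hypothetical Node type whose next pointer may refer to any node in the same collection. On save, each pointer is replaced by the index of its target (or -1 for null); on load, all nodes are created first and the indices are translated back into pointers in a second pass. It assumes every pointer stays inside the collection being saved.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>   // std::stringstream, handy for in-memory round trips
#include <vector>

struct Node
{
    int value = 0;
    Node* next = nullptr; // may point to any node in the collection, or be null
};

// Writes the node count, then "value index" for each node.
void saveNodes(std::ostream& out, const std::vector<std::unique_ptr<Node>>& nodes)
{
    std::map<const Node*, long> indexOf;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        indexOf[nodes[i].get()] = static_cast<long>(i);
    }

    out << nodes.size() << ' ';
    for (const auto& node : nodes) {
        long nextIndex = -1;
        if (node->next != nullptr) {
            std::map<const Node*, long>::const_iterator it = indexOf.find(node->next);
            assert(it != indexOf.end()); // pointers must stay inside the collection
            nextIndex = it->second;
        }
        out << node->value << ' ' << nextIndex << ' ';
    }
}

std::vector<std::unique_ptr<Node>> loadNodes(std::istream& in)
{
    std::size_t count = 0;
    in >> count; // sketch only: validate before allocating
    std::vector<std::unique_ptr<Node>> nodes;
    std::vector<long> nextIndex(count, -1);

    // Pass 1: create every node and remember the saved indices.
    for (std::size_t i = 0; i < count; ++i) {
        nodes.push_back(std::unique_ptr<Node>(new Node()));
        in >> nodes[i]->value >> nextIndex[i];
    }

    // Pass 2: translate indices into pointers that are valid in this run.
    for (std::size_t i = 0; i < count; ++i) {
        if (nextIndex[i] >= 0 && static_cast<std::size_t>(nextIndex[i]) < count) {
            nodes[i]->next = nodes[static_cast<std::size_t>(nextIndex[i])].get();
        }
    }
    return nodes;
}
```

The two passes are what make cycles work: a node can refer to one that appears later in the file, because every node already exists by the time the pointers are restored.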

Preet Kukreti answered Oct 16 '22 19:10