
What does data serialization do?

I'm having a hard time understanding what serialization is and does.

Let me simplify my problem. I have a struct info in my C/C++ programs, and I may store this struct's data in a file save.bin or send it via a socket to another computer.

struct info {
    std::string name;
    int age;
};

void write_to_file()
{
    info a = {"Steve", 10};
    std::ofstream ofs("save.bin", std::ofstream::binary);
    ofs.write((char *) &a, sizeof(a));   // am I doing it right?
    ofs.close();
}

void write_to_sock()
{
    // I don't know the socket API, but I assume writing a to a socket is similar to writing it to a file, isn't it?
}

write_to_file will simply save the struct info object a to disk, making this data persistent, right? And writing it to a socket is pretty much the same, right?

In the above code, I don't think I used data serialization, but the data a is made persistent in save.bin anyway, right?

Question

  1. Then what's the point of serialization? Do I need it here? If yes, how should I use it?

  2. I always think that any kind of files, .txt/.csv/.exe/..., are bits of 01 in memory, which means they have binary representation naturally, so can't we simply send these files via socket directly?

Code example is highly appreciated.

Alcott asked Aug 17 '12 07:08


1 Answer

but the data a is made persistent in save.bin anyway, right?

No! Your struct contains a std::string. Its exact layout (and therefore the binary data you get from a cast to char*) is not defined by the standard, and the actual character data usually resides outside the object itself, in heap-allocated memory, so you can't save that data this easily. (Some implementations store short strings inline, but you can't rely on that.) With properly done serialisation, the string data is written to the same place as the rest of the struct, so you will be able to read it back from the file. That's what you need serialisation for.
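To see why the raw write cannot work in general, here is a minimal sketch. The helper name name_cannot_fit_inline is made up for illustration. The object has a fixed size no matter what the string holds, so a long enough name cannot possibly be contained in the sizeof(a) bytes that ofs.write copies.

```cpp
#include <cstddef>
#include <string>

struct info {
    std::string name;
    int age;
};

// Returns true when the characters of v.name cannot all live inside the
// object itself. In that case a raw write of sizeof(info) bytes is
// guaranteed to miss at least part of the name.
bool name_cannot_fit_inline(const info& v)
{
    return v.name.size() > sizeof(info);
}
```

For a name of, say, 1000 characters this returns true on any realistic implementation, since sizeof(info) is only a few dozen bytes. What ends up in save.bin is then mostly a pointer and some bookkeeping, which mean nothing to another process or machine.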

How to do it: you have to encode the string in some way. The easiest way is to first write its length, then the characters themselves. When reading the file back, first read the length, then read that many bytes into a new string object.
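The length-then-data scheme described above could be sketched as follows. This is only an illustration: the function names save and load, the 4-byte length prefix, and the 4-byte age field are my own assumptions, not a fixed format. The integers are written in the host's byte order to keep the sketch short, so the file is only readable on machines with the same byte order.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct info {
    std::string name;
    int age;
};

// Layout (assumed for this sketch): 4-byte length, the characters of
// name, then a 4-byte age. Integers use the host byte order.
void save(std::ostream& out, const info& v)
{
    const std::uint32_t len = static_cast<std::uint32_t>(v.name.size());
    const std::int32_t age = static_cast<std::int32_t>(v.age);
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(v.name.data(), len);
    out.write(reinterpret_cast<const char*>(&age), sizeof age);
}

// Reads the fields back in the same order.
// Returns false if the stream ends before a complete record was read.
bool load(std::istream& in, info& v)
{
    std::uint32_t len = 0;
    std::int32_t age = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    std::string name(len, '\0');
    if (len > 0 && !in.read(&name[0], len)) return false;
    if (!in.read(reinterpret_cast<char*>(&age), sizeof age)) return false;
    v.name = name;
    v.age = age;
    return true;
}
```

Both functions take a generic stream, so the same code works with a std::ofstream opened in binary mode or with a std::stringstream whose contents you then hand to a socket. Real code should also put an upper bound on len before allocating, since a corrupt file could claim an enormous length.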

I always think that any kind of files, .txt/.csv/.exe/..., are bits of 01 in memory

Yes, but the problem is that it's not universally defined which bits represent which part of a data structure. In particular, there are little-endian and big-endian architectures, which store the bytes of a multi-byte value in opposite order. If you naïvely read a file that was written on an architecture with the other byte order, you will get garbage.
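One common way around the byte-order problem is to fix the order in the file or wire format and build the bytes with shifts instead of copying the integer's memory. A minimal sketch, assuming a big-endian ("network order") format; the helper names put_u32_be and get_u32_be are hypothetical:

```cpp
#include <cstdint>

// Store a 32-bit value as four bytes, most significant byte first,
// regardless of the host's native byte order.
void put_u32_be(std::uint32_t value, unsigned char out[4])
{
    out[0] = static_cast<unsigned char>(value >> 24);
    out[1] = static_cast<unsigned char>(value >> 16);
    out[2] = static_cast<unsigned char>(value >> 8);
    out[3] = static_cast<unsigned char>(value);
}

// Rebuild the value from four bytes stored most significant byte first.
std::uint32_t get_u32_be(const unsigned char in[4])
{
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Because the shifts operate on the value rather than on its in-memory representation, the four bytes come out the same on little-endian and big-endian machines, and so does the value read back.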

leftaroundabout answered Nov 27 '22 17:11