C - serialization techniques

I'm writing some code to serialize some data to send it over the network. Currently, I use this primitive procedure:

  1. create a void* buffer
  2. apply any byte ordering operations such as the hton family on the data I want to send over the network
  3. use memcpy to copy the memory into the buffer
  4. send the memory over the network
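
For illustration, the four steps above might look like the following for a hypothetical message made of a 2-byte type and a 4-byte value. This is only a sketch: the layout, `MSG_SIZE`, and `pack_msg` are assumptions made up for the example and are not part of any real protocol.

```c
#include <arpa/inet.h>  /* htons, htonl (POSIX) */
#include <stdint.h>
#include <string.h>     /* memcpy, size_t */

/* Hypothetical message layout: 2-byte type followed by a 4-byte value,
   both in network byte order. */
enum { MSG_SIZE = 6 };

/* Writes the message into buf (at least MSG_SIZE bytes) and returns
   the number of bytes written. */
size_t pack_msg(unsigned char *buf, uint16_t type, uint32_t value)
{
    uint16_t net_type = htons(type);    /* step 2: fix byte order */
    uint32_t net_value = htonl(value);
    size_t off = 0;

    memcpy(buf + off, &net_type, sizeof net_type);    /* step 3: copy */
    off += sizeof net_type;
    memcpy(buf + off, &net_value, sizeof net_value);
    off += sizeof net_value;

    return off;  /* step 4: the caller sends buf[0..off) */
}
```

Every additional field needs its own conversion, copy, and offset update, which is where the per-structure bloat comes from.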

The problem is that with various data structures (which often contain void* data so you don't know whether you need to care about byte ordering) the code becomes really bloated with serialization code that's very specific to each data structure and can't be reused at all.

What are some good serialization techniques for C that make this easier / less ugly?

-

Note: I'm bound to a specific protocol so I cannot freely choose how to serialize my data.

asked May 14 '11 14:05 by ryyst


2 Answers

For each data structure, have a serialize_X function (where X is the struct name) which takes a pointer to an X and a pointer to an opaque buffer structure and calls the appropriate serializing functions. You should supply some primitives, such as serialize_int, which write to the buffer and update the output index.

Before writing any data, each primitive has to call something like reserve_space(N), where N is the number of bytes required. reserve_space() will realloc the void* buffer so that it can hold the bytes already written plus N more. To make this possible, the buffer structure needs to contain a pointer to the actual data, the index at which to write the next byte (the output index), and the size that is allocated for the data.

With this system, all of your serialize_X functions should be pretty straightforward, for example:

    struct X {
        int n, m;
        char *string;
    };

    /* Serialize each member in order using the primitives. */
    void serialize_X(struct X *x, struct Buffer *output)
    {
        serialize_int(x->n, output);
        serialize_int(x->m, output);
        serialize_string(x->string, output);
    }

And the framework code will be something like:

    #include <stdlib.h>

    #define INITIAL_SIZE 32

    struct Buffer {
        void *data;   /* the serialized bytes */
        size_t next;  /* index of the next byte to write */
        size_t size;  /* bytes currently allocated for data */
    };

    struct Buffer *new_buffer(void)
    {
        struct Buffer *b = malloc(sizeof(struct Buffer));

        b->data = malloc(INITIAL_SIZE);
        b->size = INITIAL_SIZE;
        b->next = 0;

        return b;
    }

    void reserve_space(struct Buffer *b, size_t bytes)
    {
        /* keep doubling until the request fits;
           doubling enforces O(lg N) reallocs */
        while ((b->next + bytes) > b->size) {
            b->data = realloc(b->data, b->size * 2);
            b->size *= 2;
        }
    }

From this, it should be pretty simple to implement all of the serialize_() functions you need.

EDIT: For example:

    #include <string.h>     /* memcpy */
    #include <arpa/inet.h>  /* htonl */

    void serialize_int(int x, struct Buffer *b)
    {
        /* assume int == long; how can this be done better? */
        x = htonl(x);

        reserve_space(b, sizeof(int));

        memcpy(((char *)b->data) + b->next, &x, sizeof(int));
        b->next += sizeof(int);
    }
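
On the "how can this be done better?" comment: one common option is to serialize a fixed-width type such as uint32_t and write the bytes with shifts, which avoids any assumption about the size of int or the host byte order. The sketch below works on a raw byte pointer; the names `put_uint32_be` and `get_uint32_be` are made up for this example.

```c
#include <stdint.h>

/* Writes x to out[0..3], most significant byte first
   (network byte order), regardless of host endianness. */
void put_uint32_be(unsigned char *out, uint32_t x)
{
    out[0] = (unsigned char)(x >> 24);
    out[1] = (unsigned char)(x >> 16);
    out[2] = (unsigned char)(x >> 8);
    out[3] = (unsigned char)x;
}

/* Reads a big-endian 32-bit value back from in[0..3]. */
uint32_t get_uint32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

A serialize_uint32 primitive could then reserve 4 bytes, call put_uint32_be on the address at the output index, and advance the index by 4.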

EDIT: Also note that my code has some potential bugs. There is no provision for error handling and no function to free the Buffer after you're done so you'll have to do this yourself. I was just giving a demonstration of the basic architecture that I would use.
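
As a rough sketch of what the missing pieces might look like, here is a variant of the buffer code with allocation checks and a cleanup function. The int return convention (0 on success, -1 on failure) and the name `free_buffer` are assumptions chosen for this example, not something the design above prescribes.

```c
#include <stdlib.h>

#define INITIAL_SIZE 32

struct Buffer {
    void *data;
    size_t next;
    size_t size;
};

/* Returns a new empty buffer, or NULL if allocation fails. */
struct Buffer *new_buffer(void)
{
    struct Buffer *b = malloc(sizeof *b);

    if (b == NULL)
        return NULL;
    b->data = malloc(INITIAL_SIZE);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

/* Ensures `bytes` more bytes fit after b->next.
   Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int reserve_space(struct Buffer *b, size_t bytes)
{
    size_t new_size = b->size;
    void *p;

    while (new_size - b->next < bytes) {
        if (new_size > (size_t)-1 / 2)
            return -1;          /* doubling would overflow size_t */
        new_size *= 2;
    }
    if (new_size == b->size)
        return 0;               /* already enough room */
    p = realloc(b->data, new_size);
    if (p == NULL)
        return -1;              /* the old block is still valid */
    b->data = p;
    b->size = new_size;
    return 0;
}

/* Releases the buffer and its data; passing NULL is a no-op. */
void free_buffer(struct Buffer *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

With this variant, each serialize_ primitive would check the result of reserve_space and pass the failure up to its caller instead of writing past the end of the allocation.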

answered Oct 17 '22 07:10 by jstanley


I would say definitely don't try to implement serialization yourself. It's been done a zillion times, and you should use an existing solution such as protobufs: https://github.com/protobuf-c/protobuf-c

It also has the advantage of being compatible with many other programming languages.

answered Oct 17 '22 06:10 by Assaf Lavie