I'm writing some code to serialize some data to send it over the network. Currently, I use this primitive procedure:
1. create a void* buffer
2. apply the hton family on the data I want to send over the network
3. use memcpy to copy the memory into the buffer
The problem is that with various data structures (which often contain void* data, so you don't know whether you need to care about byte ordering) the code becomes really bloated with serialization code that's very specific to each data structure and can't be reused at all.
What are some good serialization techniques for C that make this easier / less ugly?
Note: I'm bound to a specific protocol so I cannot freely choose how to serialize my data.
Serialization is the process of converting an object into a stream of bytes to store the object or transmit it to memory, a database, or a file. Its main purpose is to save the state of an object in order to be able to recreate it when needed. The reverse process is called deserialization.
There are three types of serialization in .NET: binary serialization, SOAP serialization, and XML serialization.
In Java, we serialize an object by calling the writeObject() method of the ObjectOutputStream class, and deserialize it by calling the readObject() method of the ObjectInputStream class. The object's class must implement the Serializable interface.
XML, JSON, BSON, YAML, MessagePack, and protobuf are some commonly used data serialization formats.
For each data structure, have a serialize_X function (where X is the struct name) which takes a pointer to an X and a pointer to an opaque buffer structure and calls the appropriate serializing functions. You should supply some primitives such as serialize_int which write to the buffer and update the output index. Before writing any data, the primitives will have to call something like reserve_space(N), where N is the number of bytes required. reserve_space() will realloc the void* buffer so that it can hold at least N more bytes past the output index. To make this possible, the buffer structure will need to contain a pointer to the actual data, the index to write the next byte to (output index) and the size that is allocated for the data. With this system, all of your serialize_X functions should be pretty straightforward, for example:

    struct X {
        int n, m;
        char *string;
    };

    void serialize_X(struct X *x, struct Buffer *output)
    {
        serialize_int(x->n, output);
        serialize_int(x->m, output);
        serialize_string(x->string, output);
    }

And the framework code will be something like:

    #include <stdlib.h>

    #define INITIAL_SIZE 32

    struct Buffer {
        void *data;   /* the serialized bytes */
        size_t next;  /* output index: where the next byte is written */
        size_t size;  /* number of bytes allocated for data */
    };

    struct Buffer *new_buffer(void)
    {
        struct Buffer *b = malloc(sizeof(struct Buffer));

        b->data = malloc(INITIAL_SIZE);
        b->size = INITIAL_SIZE;
        b->next = 0;
        return b;
    }

    void reserve_space(struct Buffer *b, size_t bytes)
    {
        /* double size to enforce O(lg N) reallocs; loop in case
           one doubling is not enough for a large request */
        while ((b->next + bytes) > b->size) {
            b->data = realloc(b->data, b->size * 2);
            b->size *= 2;
        }
    }

From this, it should be pretty simple to implement all of the serialize_*() functions you need.
EDIT: For example:
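
    #include <arpa/inet.h>  /* htonl */
    #include <stdint.h>
    #include <string.h>     /* memcpy */

    void serialize_int(int x, struct Buffer *b)
    {
        /* assumes int is 32 bits (what htonl() operates on);
           how can this be done better? */
        uint32_t net = htonl((uint32_t)x);

        reserve_space(b, sizeof net);
        memcpy(((char *)b->data) + b->next, &net, sizeof net);
        b->next += sizeof net;
    }
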
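The comment in serialize_int asks how this could be done better. One possible answer, as a rough sketch rather than a definitive implementation: take a fixed-width uint32_t and write its bytes by shifting, which produces network byte order on any host without needing htonl. The sketch also fills in a serialize_string, which the serialize_X example calls but which is never shown. The names serialize_u32 and the string wire layout (32-bit length prefix, then the bytes) are assumptions made for illustration; since the asker is bound to a specific protocol, the real layout has to come from that protocol. The data pointer is declared as unsigned char * here only to avoid casts, and allocation failures are still not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_SIZE 32

struct Buffer {
    unsigned char *data;  /* the serialized bytes */
    size_t next;          /* index of the next byte to write */
    size_t size;          /* number of bytes currently allocated */
};

/* Same growth strategy as the framework code above; no error handling. */
struct Buffer *new_buffer(void)
{
    struct Buffer *b = malloc(sizeof *b);
    b->data = malloc(INITIAL_SIZE);
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

void reserve_space(struct Buffer *b, size_t bytes)
{
    size_t new_size = b->size;

    /* keep doubling until the pending write fits */
    while (b->next + bytes > new_size)
        new_size *= 2;
    if (new_size != b->size) {
        b->data = realloc(b->data, new_size);
        b->size = new_size;
    }
}

/* Hypothetical primitive: writes x most significant byte first,
   i.e. network byte order, regardless of host endianness. */
void serialize_u32(uint32_t x, struct Buffer *b)
{
    reserve_space(b, 4);
    b->data[b->next++] = (unsigned char)(x >> 24);
    b->data[b->next++] = (unsigned char)(x >> 16);
    b->data[b->next++] = (unsigned char)(x >> 8);
    b->data[b->next++] = (unsigned char)x;
}

/* ASSUMED wire layout: 32-bit length, then the string bytes
   without the terminating NUL. Adapt to the actual protocol. */
void serialize_string(const char *s, struct Buffer *b)
{
    assert(s != NULL);
    size_t len = strlen(s);
    assert(len <= UINT32_MAX);

    serialize_u32((uint32_t)len, b);
    reserve_space(b, len);
    memcpy(b->data + b->next, s, len);
    b->next += len;
}
```

With these two primitives, a call such as serialize_string("hi", b) on an empty buffer would append the four length bytes 00 00 00 02 followed by 'h' and 'i'.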
EDIT: Also note that my code has some potential bugs. There is no provision for error handling and no function to free the Buffer after you're done so you'll have to do this yourself. I was just giving a demonstration of the basic architecture that I would use.
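To make the missing pieces mentioned above concrete, here is a minimal sketch of how error handling and cleanup could be added to the same Buffer design. It is only one way to do it: the int return codes (0 for success, -1 for failure), free_buffer, and the serialize_bytes helper are all hypothetical names and conventions introduced for this illustration, not part of the original answer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_SIZE 32

struct Buffer {
    void *data;
    size_t next;
    size_t size;
};

/* Returns NULL if either allocation fails. */
struct Buffer *new_buffer(void)
{
    struct Buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(INITIAL_SIZE);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

/* Hypothetical cleanup function; passing NULL is allowed. */
void free_buffer(struct Buffer *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}

/* Returns 0 on success, -1 on size overflow or allocation failure.
   On failure the buffer is left unchanged and remains usable. */
int reserve_space(struct Buffer *b, size_t bytes)
{
    if (bytes > SIZE_MAX - b->next)
        return -1;

    size_t needed = b->next + bytes;
    size_t new_size = b->size;
    while (new_size < needed) {
        if (new_size > SIZE_MAX / 2) {  /* doubling would overflow */
            new_size = needed;
            break;
        }
        new_size *= 2;
    }
    if (new_size == b->size)
        return 0;

    /* realloc into a temporary so the old block is not lost on failure */
    void *p = realloc(b->data, new_size);
    if (p == NULL)
        return -1;
    b->data = p;
    b->size = new_size;
    return 0;
}

/* Hypothetical primitive that propagates the error code, so that
   serialize_X functions can stop at the first failure. */
int serialize_bytes(const void *src, size_t n, struct Buffer *b)
{
    if (reserve_space(b, n) != 0)
        return -1;
    memcpy((char *)b->data + b->next, src, n);
    b->next += n;
    return 0;
}
```

Under this convention each serialize_X function would also return int and check every primitive call, and the caller would release the buffer with free_buffer(b) once the bytes have been sent.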
I would say definitely don't try to implement serialization yourself. It's been done a zillion times and you should use an existing solution, e.g. protobufs: https://github.com/protobuf-c/protobuf-c
It also has the advantage of being compatible with many other programming languages.