
Boost serialization performance: text vs. binary format

Should I prefer binary serialization over ascii / text serialization if performance is an issue?

Has anybody tested it on a large amount of data?

Konstantin asked Jun 29 '09 12:06



3 Answers

I used Boost.Serialization to store matrices and vectors representing lookup tables, plus some metadata (strings), with an in-memory size of about 200 MB. IIRC, loading from disk into memory took 3 minutes with the text archive vs. 4 seconds with the binary archive on WinXP.
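For anyone who wants to reproduce a comparison like this, here is a minimal timing sketch. `time_seconds` is a hypothetical helper, not part of Boost; it relies only on the standard library's `std::chrono::steady_clock`. A single run is a rough measurement at best, since disk caching can skew it.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: run `work` once and return the elapsed
// wall-clock time in seconds.
double time_seconds(const std::function<void()>& work) {
    assert(work && "work must be callable");
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the load call of each archive type in a lambda and passing it to `time_seconds` would give one number per format to compare.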

Maik Beckmann answered Nov 13 '22 20:11



I benchmarked it for a problem that involves loading a large class containing lots (thousands) of nested archived classes.

To change the format, use archive streams

boost::archive::binary_oarchive
boost::archive::binary_iarchive

instead of

boost::archive::text_oarchive
boost::archive::text_iarchive

The code for loading the (binary) archive looks like:

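#include <fstream>
#include <boost/archive/binary_iarchive.hpp>

std::ifstream ifs("filename", std::ios::binary);  // open in binary mode
boost::archive::binary_iarchive input_archive(ifs);
Class* p_object = nullptr;
input_archive >> p_object;  // loading through a pointer allocates the object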

The file sizes and wall-clock times for an optimised gcc build of the above code snippet are:

  • ascii: 820MB (100%), 32.2 seconds (100%).
  • binary: 620MB (76%), 14.7 seconds (46%).

This is from a solid state drive, without any stream compression.

So the gain in speed (54% less time) is larger than the reduction in file size (24%) would suggest, which is an additional bonus of using binary.
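One plausible reason for the extra gain is that a text archive has to format and parse every number as decimal text, whereas a binary archive can copy bytes. The sketch below illustrates that difference with the standard library only. It is not how Boost implements its archives, and the helper names (`save_text`, `load_text`, `save_binary`, `load_binary`) are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: write each value as decimal text, one per line.
std::string save_text(const std::vector<double>& values) {
    std::ostringstream out;
    // max_digits10 significant digits let a double round-trip through text.
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    for (double v : values) out << v << '\n';
    return out.str();
}

// Hypothetical helper: parse the text form back, one number at a time.
std::vector<double> load_text(const std::string& data) {
    std::istringstream in(data);
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Hypothetical helper: copy the raw bytes of the array.
std::string save_binary(const std::vector<double>& values) {
    if (values.empty()) return std::string();
    const char* bytes = reinterpret_cast<const char*>(values.data());
    return std::string(bytes, values.size() * sizeof(double));
}

// Hypothetical helper: copy the raw bytes back into a vector.
std::vector<double> load_binary(const std::string& data) {
    assert(data.size() % sizeof(double) == 0);
    std::vector<double> values(data.size() / sizeof(double));
    if (!data.empty()) std::memcpy(values.data(), data.data(), data.size());
    return values;
}
```

The binary form always takes `sizeof(double)` bytes per value, while the text form can need up to 17 significant digits plus a separator. The trade-off is that raw byte copies like this are not portable between machines with different endianness or floating-point layouts.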

mirams answered Nov 13 '22 21:11



I suggest you look into protobuf (Protocol Buffers) if performance is an issue.

"Protocol Buffers" from .Net

jitter answered Nov 13 '22 21:11
