Should I prefer binary serialization over ASCII/text serialization if performance is an issue?
Has anybody tested it on a large amount of data?
I used Boost.Serialization to store matrices and vectors representing lookup tables, plus some metadata (strings), with an in-memory size of about 200 MB. IIRC, loading from disk into memory took 3 minutes with the text archive vs. 4 seconds with the binary archive on WinXP.
I benchmarked it for a problem that involves loading a large class containing lots (thousands) of nested archived classes.
To change the format, use the archive streams:
boost::archive::binary_oarchive
boost::archive::binary_iarchive
instead of
boost::archive::text_oarchive
boost::archive::text_iarchive
The code for loading the (binary) archive looks like:
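#include <fstream>
#include <boost/archive/binary_iarchive.hpp>
// Open in binary mode so the stream does not translate line endings.
std::ifstream ifs("filename", std::ios::binary);
boost::archive::binary_iarchive input_archive(ifs);
Class* p_object = nullptr;
input_archive >> p_object;  // the archive allocates the object; the caller owns it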
The file sizes and wall times for an optimised gcc build of the above code snippet are:
This is from a solid state drive, without any stream compression.
So the gain in speed is larger than the file size would suggest, and you get an additional bonus using binary.
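One plausible reason the speed-up exceeds the size ratio is that a text format has to convert every number to and from decimal digits, while a binary format can copy bytes directly. The following is only a minimal standard-library sketch of that difference, not how Boost.Serialization is implemented; the function names (`write_text`, `read_text`, `write_binary`, `read_binary`) are hypothetical, and the binary layout assumes the file is read back on the same architecture.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <vector>

// Text format: every value is formatted as decimal digits on output
// and parsed back from digits on input.
void write_text(std::ostream& os, const std::vector<double>& v) {
    // max_digits10 significant digits are enough to round-trip a double.
    os.precision(std::numeric_limits<double>::max_digits10);
    os << v.size() << '\n';
    for (double x : v) os << x << '\n';
}

std::vector<double> read_text(std::istream& is) {
    std::size_t n = 0;
    is >> n;
    std::vector<double> v(n);
    for (double& x : v) is >> x;  // one parse per element
    return v;
}

// Binary format: an element count followed by the raw bytes of the array.
// Not portable across machines with different endianness or double layout.
void write_binary(std::ostream& os, const std::vector<double>& v) {
    const std::uint64_t n = v.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof n);
    os.write(reinterpret_cast<const char*>(v.data()),
             static_cast<std::streamsize>(n * sizeof(double)));
}

std::vector<double> read_binary(std::istream& is) {
    std::uint64_t n = 0;
    is.read(reinterpret_cast<char*>(&n), sizeof n);
    std::vector<double> v(static_cast<std::size_t>(n));
    // A single bulk read, with no per-element parsing.
    is.read(reinterpret_cast<char*>(v.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    return v;
}
```

Timing both pairs on a large vector should show the same trend as above, though the exact ratio will depend on the compiler, the standard library, and the disk.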
I suggest you look into protobuf (Protocol Buffers) if performance is an issue.
"Protocol Buffers" from .Net