I've recently been running some benchmarks trying to find the "best" serialization frameworks for C++ and Java. For me, "best" comes down to two factors: the speed of serializing/deserializing and the size of the resulting serialized object.
If I look at the results of various frameworks in Java, I see that the resulting byte[] is generally smaller than the object's size in memory. This is even the case with the built-in Java serialization. If you then look at some of the other offerings (protobuf etc.), the size decreases even more.
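As a concrete way to see this, here's a minimal sketch of how one might measure the built-in serialized form (the Holder class and the element count are purely illustrative, and the in-memory figures in the comments are rough approximations for a 64-bit HotSpot JVM):

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class SerializedSizeDemo {
    // A small serializable class carrying one int of payload per instance.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
        Holder(int value) { this.value = value; }
    }

    public static void main(String[] args) throws IOException {
        List<Holder> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(new Holder(i));
        }

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(list);
        }
        // In memory the list costs roughly 1000 * (16-byte object +
        // 4-byte reference) plus the ArrayList itself, i.e. ~20 KB.
        // The stream writes the class descriptor once and then only a
        // few bytes per element, so it usually comes out smaller.
        System.out.println("Serialized size: " + bos.size() + " bytes");
    }
}
```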
I was quite surprised that when I looked at things on the C++ side (Boost, protobuf), the resulting serialized output is generally no smaller (and in some cases bigger) than the original object in memory.
Am I missing something here? Why do I get a fair amount of "compression" for free in Java but not in C++?
N.B. For measuring the size of the objects in Java, I'm using Instrumentation: http://docs.oracle.com/javase/6/docs/api/java/lang/instrument/Instrumentation.html
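For reference, the agent plumbing looks roughly like this (the class name SizeAgent is mine, the jar needs a Premain-Class manifest entry, and getObjectSize reports shallow size only):

```java
import java.lang.instrument.Instrumentation;

// Packaged into a jar whose manifest declares "Premain-Class: SizeAgent",
// then loaded with -javaagent:sizeagent.jar.
public class SizeAgent {
    private static volatile Instrumentation inst;

    public static void premain(String agentArgs, Instrumentation instrumentation) {
        inst = instrumentation;
    }

    // Shallow size of one object; to get the footprint of a whole object
    // graph you have to walk the references and sum the sizes yourself.
    public static long sizeOf(Object o) {
        return inst.getObjectSize(o);
    }
}
```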
Did you compare the absolute size of the data? Java objects carry a lot of overhead (object headers, references, alignment padding), so when you "compress" them into a serialized buffer there is a lot of overhead to shed. In C/C++ the in-memory representation is already close to the bare minimum required for the physical data, so there is little room for compression. In fact, the additional information you have to add so the data can be deserialized can even make the output grow.
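To put rough numbers on it, here is a sketch; the 16-byte in-memory figure is an approximation for a 64-bit HotSpot JVM, and the varint logic is a hand-rolled stand-in for the base-128 encoding protobuf uses for integer fields:

```java
public class VarintDemo {
    // Number of base-128 varint bytes needed to encode v (protobuf-style).
    static int varintLength(long v) {
        int n = 1;
        while ((v >>>= 7) != 0) n++;
        return n;
    }

    public static void main(String[] args) {
        // A Java object wrapping one int occupies ~16 bytes in memory
        // (approximate); in C++ the int itself is just 4 bytes.
        for (long v : new long[]{5, 300, 1_000_000}) {
            System.out.println(v + " encodes in " + varintLength(v)
                    + " byte(s) + ~1 tag byte on the wire");
        }
        // So Java sheds ~14 bytes of per-field overhead when serializing,
        // while C++ sheds at most ~2 and also gains framing/metadata,
        // which is why the C++ output can match or exceed the struct size.
    }
}
```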