
Serialization - differences between C++ and Java

I've recently been running some benchmarks to find the "best" serialization frameworks for C++ and for Java. The factors that make up "best" for me are the speed of serializing and deserializing, and the size of the resulting serialized object.

If I look at the results of various frameworks in Java, I see that the resulting byte[] is generally smaller than the object's size in memory. This is the case even with the built-in Java serialization. With some of the other offerings (protobuf etc.) the size decreases even more.

I was quite surprised that, when I looked at things on the C++ side (boost, protobuf), the resulting serialized object is generally no smaller (and in some cases bigger) than the original object.

Am I missing something here? Why do I get a fair amount of "compression" for free in Java but not in C++?

N.B. For measuring the size of the objects in Java, I'm using Instrumentation: http://docs.oracle.com/javase/6/docs/api/java/lang/instrument/Instrumentation.html

imrichardcole asked Oct 05 '22 01:10



1 Answer

Did you compare the absolute sizes of the data? I would say that Java objects carry more in-memory overhead (object headers, references, padding), so when you "compress" the data into a serialized buffer, far more of that overhead falls away. In C/C++ an object is already close to the bare minimum needed for the physical data, so there is not much room for compression. In fact, you have to add extra information to be able to deserialize it, which can even make the result larger than the original.
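To make the C++ side of this concrete, here is a minimal sketch. The `Sample` struct and the one-tag-byte-per-field wire format are made up for illustration and are not how boost or protobuf actually encode data (protobuf, for instance, uses variable-length integers, so small values can shrink). The point is only that a plain struct is already about as small as its fields, so any metadata a format adds pushes the serialized size above the raw payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record, used only for illustration.
struct Sample {
    std::int32_t id;
    std::int32_t count;
    double value;
};

// Sum of the raw field sizes: what a fixed-width, uncompressed encoding
// needs for the data alone, with no metadata at all.
constexpr std::size_t rawPayloadSize() {
    return sizeof(std::int32_t) + sizeof(std::int32_t) + sizeof(double);
}

// The in-memory struct is the raw fields plus any alignment padding.
static_assert(sizeof(Sample) >= rawPayloadSize(),
              "a struct cannot be smaller than its fields");

// Appends the bytes of a trivially copyable value to the buffer.
template <typename T>
void appendBytes(std::vector<std::uint8_t>& out, const T& v) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&v);
    out.insert(out.end(), p, p + sizeof(T));
}

// Made-up wire format (an assumption, not a real library's layout):
// one tag byte per field, followed by the field's fixed-width bytes.
std::vector<std::uint8_t> serializeTagged(const Sample& s) {
    std::vector<std::uint8_t> out;
    out.push_back(1); appendBytes(out, s.id);
    out.push_back(2); appendBytes(out, s.count);
    out.push_back(3); appendBytes(out, s.value);
    return out;
}

// Returns how many bytes the tagged encoding adds over the raw payload.
std::size_t taggedOverhead(const Sample& s) {
    const std::size_t n = serializeTagged(s).size();
    assert(n >= rawPayloadSize());  // metadata can only add bytes here
    return n - rawPayloadSize();
}
```

Calling `taggedOverhead(Sample{1, 2, 3.0})` gives the number of bytes added by the tags, one per field in this toy format. Comparing `sizeof(Sample)` with `rawPayloadSize()` on your own platform shows how little slack the in-memory layout has to begin with, which is why there is not much for a serializer to save.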

Devolus answered Oct 13 '22 09:10
