
Java Object Serialization Performance tips

I need to serialize a huge tree of objects (7,000) to disk. Originally we kept this tree in a database with Kodo, but loading it into memory took thousands upon thousands of queries and a good part of the local universe's available time.

I tried serialization for this and indeed I get a performance improvement. However, I get the feeling that I could improve this by writing my own, custom serialization code. I need to make loading this serialized object as fast as possible.

On my machine, serializing / deserializing these objects takes about 15 seconds. Loading them from the database takes around 40 seconds.

Any tips on what I could do to improve this performance, taking into consideration that, because the objects are in a tree, they reference each other?

Mario Ortegón asked Mar 02 '09 12:03



3 Answers

Don't forget to use the 'transient' keyword for instance variables that don't have to be serialized. This gives you a performance boost because you are no longer reading/writing unnecessary data.
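As a rough illustration of the 'transient' tip, here is a minimal sketch. The Node class and its cachedUpperName field are made up for the example; the assumption is that the field holds a derived value that can be recomputed after loading, since transient fields come back with their default value (null here).

```java
import java.io.*;

final class TransientDemo {
    /** Hypothetical tree node; the names are illustrative only. */
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;

        final String name;
        // Derived value: skipped by default serialization, rebuilt on demand.
        private transient String cachedUpperName;

        Node(String name) { this.name = name; }

        String upperName() {
            if (cachedUpperName == null) {
                cachedUpperName = name.toUpperCase();
            }
            return cachedUpperName;
        }

        boolean hasCachedName() { return cachedUpperName != null; }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

After a round trip through serialize() and deserialize(), the copy has its name but no cached value until upperName() is called again.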

dogbane answered Oct 13 '22 00:10



One optimization is to customize how class descriptors are written: store the class descriptors in a separate database, and in the object stream refer to them only by ID. This reduces the space needed by the serialized data. See for example how in one project the classes SerialUtil and ClassesTable do it.
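A minimal sketch of the descriptor-by-ID idea, using the writeClassDescriptor() and readClassDescriptor() hooks of the standard object streams. This is not how SerialUtil or ClassesTable work (I have not reproduced them); the KNOWN list, the Point class and the helper names are all hypothetical, and the "database" is just a hard-coded list that writer and reader must share.

```java
import java.io.*;
import java.util.*;

final class CompactDescriptorDemo {
    // Hypothetical class table: a class's index in this list is its ID.
    // Writer and reader must agree on it; here it is simply hard-coded.
    static final List<Class<?>> KNOWN = Arrays.<Class<?>>asList(Point.class);

    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** Writes a 4-byte ID instead of the full class descriptor. */
    static class CompactOut extends ObjectOutputStream {
        CompactOut(OutputStream out) throws IOException { super(out); }

        @Override
        protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
            int id = KNOWN.indexOf(desc.forClass());
            if (id < 0) {
                throw new NotSerializableException("class not in table: " + desc.getName());
            }
            writeInt(id);
        }
    }

    /** Reads the ID back and looks up the local descriptor for that class. */
    static class CompactIn extends ObjectInputStream {
        CompactIn(InputStream in) throws IOException { super(in); }

        @Override
        protected ObjectStreamClass readClassDescriptor() throws IOException {
            int id = readInt();
            if (id < 0 || id >= KNOWN.size()) {
                throw new StreamCorruptedException("unknown class ID: " + id);
            }
            return ObjectStreamClass.lookup(KNOWN.get(id));
        }
    }

    static byte[] write(Object o, boolean compact) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                 compact ? new CompactOut(bytes) : new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new CompactIn(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

The trade-off is that the stream is no longer self-describing: it can only be read with the same class table, and class changes have to be handled by whoever maintains that table.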

Making classes Externalizable instead of Serializable can give some performance benefits. The downside is that it requires lots of manual work.
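To give an idea of the manual work involved, here is a small sketch of an Externalizable class. The Item class and its fields are invented for the example, and it assumes name is never null (writeUTF() does not accept null).

```java
import java.io.*;

final class ExternalizableDemo {
    /** Hypothetical payload class; the field names are assumptions. */
    public static class Item implements Externalizable {
        private String name;
        private int weight;

        /** Externalizable classes need a public no-arg constructor. */
        public Item() { }

        public Item(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }

        public String getName() { return name; }
        public int getWeight() { return weight; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // Only the raw field values are written, in a fixed order.
            out.writeUTF(name);
            out.writeInt(weight);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Must read exactly what writeExternal wrote, in the same order.
            name = in.readUTF();
            weight = in.readInt();
        }
    }

    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

Every field added later has to be added to both methods by hand, which is where the "lots of manual work" comes from.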

Then there are other serialization libraries, for example jserial, which can give better performance than Java's default serialization. Also, if the object graph does not include cycles, then it can be serialized a little bit faster, because the serializer does not need to keep track of objects it has seen (see "How does it work?" in jserial's FAQ).

Esko Luontola answered Oct 13 '22 01:10



I would recommend implementing custom writeObject() and readObject() methods. That way you can avoid writing the child nodes of every node in the tree. With default serialization, each node is serialized together with all its children.

For example, writeObject() of a Tree class should iterate through all the nodes of the tree and write only each node's data (not the Node objects themselves), along with markers that identify the tree level.

You can look at LinkedList to see how these methods are implemented there. It uses the same approach to avoid writing the prev and next entries for every single entry.
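A minimal sketch of that idea, assuming a simple tree whose nodes carry a non-null String. The FlatTree and Node names, and the choice of an int depth as the level marker with -1 as the end marker, are my own for the example. The nodes are written in preorder, so the reader can rebuild the structure from the depths alone.

```java
import java.io.*;
import java.util.*;

final class FlatTree implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Plain node; deliberately not Serializable, the tree writes its data. */
    static final class Node {
        final String data;
        final List<Node> children = new ArrayList<>();

        Node(String data) { this.data = data; }

        Node add(String childData) {
            Node child = new Node(childData);
            children.add(child);
            return child;
        }
    }

    private transient Node root;

    FlatTree(Node root) { this.root = root; }

    Node root() { return root; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        // Iterative preorder walk: write (depth, data) for every node.
        Deque<Node> nodes = new ArrayDeque<>();
        Deque<Integer> depths = new ArrayDeque<>();
        if (root != null) {
            nodes.push(root);
            depths.push(0);
        }
        while (!nodes.isEmpty()) {
            Node node = nodes.pop();
            int depth = depths.pop();
            out.writeInt(depth);
            out.writeUTF(node.data);
            // Push children in reverse so the first child is written first.
            for (int i = node.children.size() - 1; i >= 0; i--) {
                nodes.push(node.children.get(i));
                depths.push(depth + 1);
            }
        }
        out.writeInt(-1); // end marker
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // path.get(d) is the most recently read node at depth d.
        List<Node> path = new ArrayList<>();
        int depth;
        while ((depth = in.readInt()) >= 0) {
            if (depth > path.size() || (depth == 0 && !path.isEmpty())) {
                throw new StreamCorruptedException("unexpected depth " + depth);
            }
            Node node = new Node(in.readUTF());
            if (depth == 0) {
                root = node;
            } else {
                path.get(depth - 1).children.add(node);
            }
            while (path.size() > depth) {
                path.remove(path.size() - 1);
            }
            path.add(node);
        }
    }
}
```

Writing a FlatTree with an ordinary ObjectOutputStream and reading it back with an ObjectInputStream rebuilds the same shape. Because both loops are iterative, a very deep tree should not run into the deep recursion that default serialization of linked nodes can cause.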

Andrey Vityuk answered Oct 13 '22 02:10
