I need to serialize a huge tree of objects (7,000) to disk. Originally we kept this tree in a database with Kodo, but loading it into memory took thousands upon thousands of queries and a good part of the local universe's available time.
I tried serialization for this and indeed I get a performance improvement. However, I get the feeling that I could improve this by writing my own, custom serialization code. I need to make loading this serialized object as fast as possible.
On my machine, serializing and deserializing these objects takes about 15 seconds. Loading them from the database takes around 40 seconds.
Any tips on what I could do to improve this performance, taking into consideration that because the objects are in a tree, they reference each other?
Don't forget to use the 'transient' keyword for instance variables that don't have to be serialized. This gives you a performance boost because you are no longer reading and writing unnecessary data.
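A minimal sketch of what that looks like, assuming a hypothetical Node class with a derived cachedDepth field (both names are made up for illustration). Note that a transient field comes back with its default value after loading, so it has to be recomputed if it is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class TransientDemo {
    // Hypothetical tree node; the field names are illustrative only.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;                // written to the stream
        transient int cachedDepth;  // skipped on write; recompute it after loading

        Node(String name) { this.name = name; }
    }

    static byte[] serialize(Object o) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node n = new Node("root");
        n.cachedDepth = 3;
        Node copy = (Node) deserialize(serialize(n));
        // The transient field was never written, so it holds the int default.
        System.out.println(copy.name + " " + copy.cachedDepth); // prints "root 0"
    }
}
```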
One optimization is customizing the class descriptors, so that you store the class descriptors in a different database and in the object stream you only refer to them by ID. This reduces the space needed by the serialized data. See for example how in one project the classes SerialUtil and ClassesTable do it.
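As a rough sketch of the idea (this is not the code of the project mentioned above), you can subclass the object streams and override writeClassDescriptor() and readClassDescriptor() so that only an int ID goes into the stream. ClassTable, CompactOutput, CompactInput and Item are hypothetical names, and the table is kept in memory here; in practice it would have to be persisted somewhere both the writer and the reader can reach.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClassDescriptorDemo {
    // Hypothetical table mapping class names to small IDs, shared by writer and reader.
    static class ClassTable {
        private final Map<String, Integer> ids = new HashMap<>();
        private final List<String> names = new ArrayList<>();

        int idFor(String className) {
            Integer id = ids.get(className);
            if (id == null) {
                id = names.size();
                ids.put(className, id);
                names.add(className);
            }
            return id;
        }

        String nameFor(int id) { return names.get(id); }
    }

    static class CompactOutput extends ObjectOutputStream {
        private final ClassTable table;
        CompactOutput(OutputStream out, ClassTable table) throws IOException {
            super(out);
            this.table = table;
        }
        @Override
        protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
            writeInt(table.idFor(desc.getName())); // ID instead of the full descriptor
        }
    }

    static class CompactInput extends ObjectInputStream {
        private final ClassTable table;
        CompactInput(InputStream in, ClassTable table) throws IOException {
            super(in);
            this.table = table;
        }
        @Override
        protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
            // Rebuild the descriptor from the class that is loaded locally.
            return ObjectStreamClass.lookupAny(Class.forName(table.nameFor(readInt())));
        }
    }

    // Illustrative payload class.
    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String label;
        Item(int id, String label) { this.id = id; this.label = label; }
    }

    static byte[] writeCompact(Object o, ClassTable table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new CompactOutput(bytes, table)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] writeDefault(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object readCompact(byte[] data, ClassTable table) {
        try (ObjectInputStream in = new CompactInput(new ByteArrayInputStream(data), table)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassTable table = new ClassTable();
        Item item = new Item(7, "leaf");
        byte[] compact = writeCompact(item, table);
        Item copy = (Item) readCompact(compact, table);
        System.out.println(copy.id + " " + copy.label);
        System.out.println(compact.length + " bytes vs " + writeDefault(item).length + " bytes");
    }
}
```

The trade-off is that the stream is no longer self-describing: it can only be read with the same table and with classes whose layout matches what was written.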
Making classes Externalizable instead of Serializable can give some performance benefits. The downside is that it requires lots of manual work.
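To give an idea of the manual work involved, here is a small sketch with a hypothetical NodeData class (the class and its fields are assumptions for illustration). The class writes and reads its own fields in a fixed order and must have a public no-arg constructor, which is called during deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ExternalizableDemo {
    // Hypothetical payload of a tree node.
    public static class NodeData implements Externalizable {
        int id;
        String label;

        public NodeData() { }  // required: invoked before readExternal()

        NodeData(int id, String label) {
            this.id = id;
            this.label = label;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(label);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read in exactly the order they were written.
            id = in.readInt();
            label = in.readUTF();
        }
    }

    static NodeData roundTrip(NodeData data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(data);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (NodeData) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NodeData copy = roundTrip(new NodeData(42, "leaf"));
        System.out.println(copy.id + " " + copy.label); // prints "42 leaf"
    }
}
```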
Then there are other serialization libraries, for example jserial, which can give better performance than Java's default serialization. Also, if the object graph does not include cycles, then it can be serialized a little bit faster, because the serializer does not need to keep track of objects it has seen (see "How does it work?" in jserial's FAQ).
I would recommend implementing custom writeObject() and readObject() methods. This way you can avoid writing the child nodes for each node in the tree. With default serialization, each node is serialized together with all its children.
For example, writeObject() of a Tree class should iterate through all the nodes of the tree and write only each node's data (not the Node objects themselves), along with some markers that identify the tree level.
You can look at LinkedList to see how these methods are implemented there. It uses the same approach to avoid writing the prev and next entries for every single entry.
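Something along these lines, as a sketch only. Tree and Node are hypothetical classes, and instead of level markers this version writes a child count after each node's data, which carries the same structural information. Node is deliberately not Serializable, so it can never be written by the default mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TreeSerializationDemo {
    // Plain node: only its data and child count end up in the stream.
    static class Node {
        final String data;
        final List<Node> children = new ArrayList<>();

        Node(String data) { this.data = data; }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    static class Tree implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Node root;  // written by hand in writeObject()

        Tree(Node root) { this.root = root; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeBoolean(root != null);
            if (root != null) {
                writeNode(out, root);
            }
        }

        // Pre-order traversal; recursion depth equals tree depth.
        private static void writeNode(ObjectOutputStream out, Node n) throws IOException {
            out.writeUTF(n.data);
            out.writeInt(n.children.size());
            for (Node child : n.children) {
                writeNode(out, child);
            }
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            root = in.readBoolean() ? readNode(in) : null;
        }

        private static Node readNode(ObjectInputStream in) throws IOException {
            Node n = new Node(in.readUTF());
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                n.children.add(readNode(in));
            }
            return n;
        }
    }

    static Tree roundTrip(Tree tree) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tree);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tree) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root").add(new Node("a").add(new Node("c"))).add(new Node("b"));
        Tree copy = roundTrip(new Tree(root));
        System.out.println(copy.root.children.get(0).children.get(0).data); // prints "c"
    }
}
```

If nodes also hold parent references, those can be restored while rebuilding in readNode() rather than being written, which is the same trick LinkedList uses for prev and next.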