How does Java's serialization work, and when should it be used instead of some other persistence technique?

Tags:

java

serialization

I've recently been trying to learn more about Java's serialization and to test it for both work and personal projects, and I must say that the more I learn about it, the less I like it. This may be down to misinformation, though, which is why I'm asking you all these two things:

1: At the byte level, how does serialization know how to match serialized values to a class?

One of my problems here is that I ran a small test with an ArrayList containing the values "one", "two" and "three". After serialization the byte array took 78 bytes, which seems like an awful lot for so little information (19+3+3+4 bytes). Granted, there's bound to be some overhead, but this leads to my second question:

2: Can serialization be considered a good method for persisting objects at all? Obviously, if I used some homemade XML format, the persisted data would look something like this:

<object>
    <class="java.util.ArrayList">
    <!-- Object array inside Arraylist is called elementData -->
    <field name="elementData">
        <value>One</value>
        <value>Two</value>
        <value>Three</value>
    </field>
</object>

which, like XML in general, is a bit bloated and takes 138 bytes (without whitespace, that is). The same in JSON could be

{
    "java.util.ArrayList": {
        "elementData": [
            "one",
            "two",
            "three"
        ]
    }
}

which is 75 bytes, so already slightly smaller than Java's serialization. With these text-based formats it's of course obvious that there has to be a way to represent your basic data as text, numbers or a combination of both.

So to recap: how does serialization work at the byte/bit level, when should and shouldn't it be used, and what are its real benefits besides coming standard with Java?

678

asked Dec 09 '08 08:12

Esko

1 Answer

I would personally try to avoid Java's "built-in" serialization:

It's not portable to other platforms
It's not hugely efficient
It's fragile - getting it to cope with multiple versions of a class is somewhat tricky. Even changing compilers can break serialization unless you're careful.
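To make the fragility point a little more concrete: when a Serializable class does not declare a serialVersionUID, the runtime derives one from the class's structure, and that derived value can differ after a class change or, in some cases, a compiler change. The following is only a minimal sketch of the usual mitigation, which is declaring the value explicitly. The names VersionDemo and Point are hypothetical and are not taken from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class VersionDemo {

    // Hypothetical example class used only for illustration.
    static final class Point implements Serializable {
        // Declared explicitly so the stream identity does not depend on
        // a value computed from the class's structure.
        private static final long serialVersionUID = 1L;

        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Returns the UID the serialization runtime will write for Point.
    static long declaredUid() {
        return ObjectStreamClass.lookup(Point.class).getSerialVersionUID();
    }

    // Serializes a Point to memory and reads it back.
    static Point roundTrip(Point p) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println("uid=" + declaredUid() + " x=" + copy.x + " y=" + copy.y);
    }
}
```

Pinning the UID only stops the stream from being rejected outright; deciding how old and new field layouts map onto each other is still up to you.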

For details of what the actual bytes mean, see the Java Object Serialization Specification.
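As a rough illustration of how a stream identifies its class, the sketch below serializes the ArrayList from the question and reads back the first few fields by hand: the stream magic number and version, the tags that introduce an object and its class descriptor, the fully qualified class name, and the serialVersionUID. The class and method names (StreamPeek, readHeader) are hypothetical, and the sketch assumes the top-level object is an ordinary, non-proxy class written with the default protocol; it is not a general parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

final class StreamPeek {

    // Holder for the two identifying fields read from the class descriptor.
    static final class Header {
        final String className;
        final long serialVersionUid;

        Header(String className, long serialVersionUid) {
            this.className = className;
            this.serialVersionUid = serialVersionUid;
        }
    }

    // Serializes any object to an in-memory byte array.
    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads only the start of the stream; assumes a plain object at the top level.
    static Header readHeader(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            if (in.readShort() != ObjectStreamConstants.STREAM_MAGIC
                    || in.readShort() != ObjectStreamConstants.STREAM_VERSION) {
                throw new IllegalStateException("not a serialization stream");
            }
            if (in.readByte() != ObjectStreamConstants.TC_OBJECT
                    || in.readByte() != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IllegalStateException("expected an object with a new class descriptor");
            }
            String className = in.readUTF(); // 2-byte length followed by the name
            long uid = in.readLong();        // 8-byte serialVersionUID
            return new Header(className, uid);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new ArrayList<>(Arrays.asList("one", "two", "three")));
        Header header = readHeader(bytes);
        System.out.println("total bytes: " + bytes.length);
        System.out.println("class name in stream: " + header.className);
        System.out.println("serialVersionUID in stream: " + header.serialVersionUid);
    }
}
```

The class name and UID written at the front are what the reading side uses to locate and validate the class, and they account for a good part of the overhead on a small object.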

There are various alternatives, such as:

XML and JSON, as you've shown (various XML flavours, of course)
YAML
Facebook's Thrift (RPC as well as serialization)
Google Protocol Buffers
Hessian (web services as well as serialization)
Apache Avro
Your own custom format

(Disclaimer: I work for Google, and I'm doing a port of Protocol Buffers to C# as my 20% project, so clearly I think that's a good bit of technology :)

Cross-platform formats are almost always more restrictive than platform-specific formats for obvious reasons - Protocol Buffers has a pretty limited set of native types, for example - but the interoperability can be incredibly useful. You also need to consider the impact of versioning, with backward and forward compatibility, etc. The text formats are generally hand-editable, but tend to be less efficient in both space and time.
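To give a feel for the space trade-off, here is a minimal sketch of a hand-rolled binary format for a list of strings, compared against built-in serialization of the same data. The format (a 4-byte count followed by length-prefixed strings) and the name CompactStrings are assumptions made purely for illustration; the format carries no type information and has no versioning story, which is exactly what it gives up for its size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompactStrings {

    // Writes a 4-byte element count, then each string via writeUTF
    // (2-byte length prefix plus modified UTF-8 bytes).
    static byte[] encode(List<String> values) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeInt(values.size());
                for (String value : values) {
                    out.writeUTF(value);
                }
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads back exactly what encode wrote.
    static List<String> decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            List<String> values = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                values.add(in.readUTF());
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Size of the same data written with built-in serialization, for comparison.
    static int javaSerializedSize(List<String> values) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new ArrayList<>(values));
            }
            return buffer.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("one", "two", "three");
        byte[] encoded = encode(values);
        System.out.println("custom format bytes: " + encoded.length);
        System.out.println("built-in serialization bytes: " + javaSerializedSize(values));
        System.out.println("decoded: " + decode(encoded));
    }
}
```

Whether the saving matters depends on the requirements: anything reading this format has to know in advance that it contains a list of strings.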

Basically, you need to look at your requirements carefully.

142

answered Sep 28 '22 12:09

Jon Skeet
