
What is serialization all about?

Where exactly does serialization come into the picture? I read about serialization on the net and learned that it is an interface that, if implemented in a class, means the class can automatically be serialized and deserialized by the different serializers.

Give me a good reason why and when a class would need to be serialized. Once it's serialized, what exactly happens?

FosterZ asked Oct 21 '10 05:10


1 Answer

Serialization is needed whenever an object needs to be persisted or transmitted beyond the scope of its existence.

Persistence is the ability to save an object somewhere and load it later with the same state. For example:

  • You might need to store an object instance on disk as part of a file.
  • You might need to store an object in a database as a blob (binary large object).

Transmission is the ability to send an object outside of its original scope to some receiver. For example:

  • You might need to transmit an instance of an object to a remote machine.
  • You might need to transmit an instance to another AppDomain or process on the same machine.

For each of these, there must be some serial bit representation that can be stored, communicated, and then later used to reconstitute the original object. The process of turning an object into this series of bits is called "serialization", while the process of turning the series of bits into the original object is called "deserialization".
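
As a rough illustration of that round trip, here is a minimal sketch in Python (chosen for brevity; it is not the .NET API). The Foo type and the serialize/deserialize helpers are hypothetical names made up for this example, and JSON is just one possible byte representation.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Foo:
    # Hypothetical type used only for illustration.
    name: str
    count: int


def serialize(obj: Foo) -> bytes:
    # Turn the object's state into a series of bytes.
    return json.dumps(asdict(obj)).encode("utf-8")


def deserialize(data: bytes) -> Foo:
    # Rebuild an equivalent object from those bytes.
    return Foo(**json.loads(data.decode("utf-8")))


original = Foo(name="example", count=3)
data = serialize(original)    # these bytes could be written to disk or sent over a socket
restored = deserialize(data)  # a new object with the same state as the original
```

The restored object is a distinct instance that compares equal to the original, which is all that persistence and transmission require.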

The actual representation of the object in serialized form can differ depending on your goals. For example, in C#, you have both XML serialization (via the XmlSerializer class) and binary serialization (via the BinaryFormatter class). Depending on your needs, you can even write your own custom serializer to do additional work such as compression or encryption. If you need a language- and platform-neutral serialization format, you can try Google's Protocol Buffers, which now has support for .NET (I have not used this).
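
To make the "custom serializer that does additional work" idea concrete, here is a small Python sketch that layers compression on top of a JSON encoding. The function names and the sample state are assumptions for illustration only; a real serializer would choose its own format and might add encryption at the same point.

```python
import json
import zlib


def serialize_compressed(state: dict) -> bytes:
    # Step 1: encode the state as bytes. Step 2: compress those bytes.
    raw = json.dumps(state).encode("utf-8")
    return zlib.compress(raw)


def deserialize_compressed(data: bytes) -> dict:
    # Reverse the steps in the opposite order: decompress, then decode.
    return json.loads(zlib.decompress(data).decode("utf-8"))


# Hypothetical, highly repetitive state so the effect of compression is visible.
state = {"name": "example", "values": [0] * 1000}
raw_size = len(json.dumps(state).encode("utf-8"))
packed = serialize_compressed(state)
```

The deserializer has to undo exactly what the serializer did, which is the same "both sides must understand each other" constraint that applies to any format.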

The XML representation mentioned above is good for storing an object in a standard format, but it can be verbose and slow depending on your needs. The binary representation saves on space but isn't as portable across languages and runtimes as XML is. The important point is that the serializer and deserializer must understand each other. This can be a problem when you start introducing backward and forward compatibility and versioning.
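
The size and portability trade-off can be seen in a small Python sketch that encodes the same two integers both ways. The Point layout, the element names, and the helper functions are hypothetical; the binary layout in particular is an assumption that both sides would have to agree on in advance.

```python
import struct
import xml.etree.ElementTree as ET
from typing import Tuple

# Agreed-upon binary layout: two little-endian 32-bit signed integers.
POINT_FORMAT = "<ii"


def to_xml(x: int, y: int) -> bytes:
    # Self-describing: the field names travel with the data.
    root = ET.Element("Point")
    ET.SubElement(root, "X").text = str(x)
    ET.SubElement(root, "Y").text = str(y)
    return ET.tostring(root, encoding="utf-8")


def from_xml(data: bytes) -> Tuple[int, int]:
    root = ET.fromstring(data)
    return int(root.findtext("X")), int(root.findtext("Y"))


def to_binary(x: int, y: int) -> bytes:
    # Compact, but meaningless to a reader that does not know POINT_FORMAT.
    return struct.pack(POINT_FORMAT, x, y)


def from_binary(data: bytes) -> Tuple[int, int]:
    return struct.unpack(POINT_FORMAT, data)


xml_bytes = to_xml(10, 20)
bin_bytes = to_binary(10, 20)
```

Both forms round-trip to the same values, but the binary form is a fixed 8 bytes while the XML form carries its element names along with the data.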

An example of potential serialization compatibility issues:

  • You release version 1.0 of your program which is able to serialize some Foo object to a file.
  • The user saves their Foo to a file.
  • You release version 2.0 of your program with an updated Foo.
  • The user tries to open the version 1.0 file with your version 2.0 program.

This can be troublesome if the version 2.0 Foo has additional properties that the version 1.0 Foo didn't have. You must either explicitly not support this scenario or build a versioning strategy into your serialization. .NET can do some of this for you. You might also have the reverse problem: the user might try to open a version 2.0 Foo file with version 1.0 of your program.

I have not used these techniques myself, but .NET 2.0 and later support version-tolerant serialization to handle both forward and backward compatibility:

  • Tolerance of extraneous or unexpected data. This enables newer versions of the type to send data to older versions.
  • Tolerance of missing optional data. This enables older versions to send data to newer versions.
  • Serialization callbacks. This enables intelligent default value setting in cases where data is missing.
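
The first two ideas in the list can be sketched in a language-neutral way. The Python below is not the .NET mechanism; FooV1, FooV2, the color field, and its default value are all hypothetical, and JSON is assumed as the stored format.

```python
import json


class FooV1:
    """Version 1.0 of a hypothetical Foo: it only knows about 'name'."""

    def __init__(self, name: str):
        self.name = name

    def to_bytes(self) -> bytes:
        return json.dumps({"name": self.name}).encode("utf-8")

    @classmethod
    def from_bytes(cls, data: bytes) -> "FooV1":
        fields = json.loads(data.decode("utf-8"))
        # Tolerate extraneous data: read only the keys this version knows
        # and silently ignore anything a newer version added.
        return cls(name=fields["name"])


class FooV2:
    """Version 2.0 adds an optional 'color' field."""

    DEFAULT_COLOR = "black"  # assumed default, applied when the data is missing

    def __init__(self, name: str, color: str):
        self.name = name
        self.color = color

    def to_bytes(self) -> bytes:
        return json.dumps({"name": self.name, "color": self.color}).encode("utf-8")

    @classmethod
    def from_bytes(cls, data: bytes) -> "FooV2":
        fields = json.loads(data.decode("utf-8"))
        # Tolerate missing optional data: fall back to a default value.
        return cls(name=fields["name"], color=fields.get("color", cls.DEFAULT_COLOR))


old_file = FooV1("report").to_bytes()
new_file = FooV2("report", "red").to_bytes()

upgraded = FooV2.from_bytes(old_file)    # 2.0 program opening a 1.0 file
downgraded = FooV1.from_bytes(new_file)  # 1.0 program opening a 2.0 file
```

Note that the 1.0 reader drops the color value, so saving again from version 1.0 would lose it. Whether that is acceptable is part of the versioning decision.
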
Chris Schmich answered Oct 19 '22 05:10