
Java Externalization vs Transient

I was thinking about the purpose of Externalisation, given that you could simply label a property as transient and prevent its serialisation. However, upon further research I found out that this approach (i.e. labelling as transient) may not be ideal if you need to decide what's required at run-time. Theoretically, it makes sense to me. However, practically I don't see how Externalisation is more run-time friendly. I mean, you still have to decide what's required or not within the writeExternal() and readExternal() during the definition of the class. So, how is that more run-time friendly?

The document that highlighted this is as follows,

If everything is automatically taken care of by implementing the Serializable interface, why would anyone like to implement the Externalizable interface and bother to define the two methods? Simply to have complete control over the process. Okay... let's take a sample example to understand this. Suppose we have an object having hundreds of fields (non-transient) and we want only a few fields to be stored on the persistent storage and not all. One solution would be to declare all other fields (except those which we want to serialize) as transient and the default Serialization process will automatically take care of that. But what if those few fields are not fixed at design time and are instead conditionally decided at runtime? In such a situation, implementing the Externalizable interface will probably be a better solution. Similarly, there may be scenarios where we simply don't want to maintain the state of the superclasses (which is automatically maintained by the Serializable interface implementation).

Grateful asked Dec 25 '22 17:12



1 Answer

I would like to point out that there are other advantages/disadvantages to consider when comparing Serializable and Externalizable methods.

Externalizing is faster

During serialization the JVM will always first check if the class is Externalizable. If that's the case then it will use the read/writeExternal methods. (makes sense, right)

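Externalizable classes need less recursion through the object graph, because you write exactly the data you need instead of letting the default mechanism walk every non-transient field by reflection. It also results in more compact output (fewer bytes), which brings us to the next point ...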

Externalized output is more compact

If you compare the actual output, it looks something like this. The class description of each object contains a flag that marks whether the class is just Serializable or also Externalizable.

OBJECT
CLASSDESC
  Class Name: "MyClassName"
  Class UID:  ...
  Class Desc Flags: SERIALIZABLE or EXTERNALIZABLE

If it's just SERIALIZABLE, then a list of fields follows (like a definition), followed by the actual data. The class description is written once per class per stream; later objects of the same class refer back to it, and only their data is repeated.

  Field Count: ...
  // followed by a bunch of field declarations
  Field type: object
  Field name: "fieldName"
  Class name: "Ljava/lang/String;"

  // followed by the actual data
  STRING: "foo"
  STRING: "bar"
  float: 123456

Externalizable objects don't contain a list of fields followed by data. They just contain the encoded data, in the order that you wrote it.

  EXTERNALIZABLE: [00 AA 00 BC ... ]
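The size difference can be checked with a small experiment. The following is only a sketch: SerItem and ExtItem are hypothetical classes invented for this comparison, holding the same three values, and serializedSize is a helper that counts the bytes written to an in-memory stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SizeComparison {

    // Hypothetical class that relies on default serialization.
    static class SerItem implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "foo";
        String code = "bar";
        int price = 123456;
    }

    // Same data, but written by hand through Externalizable.
    static class ExtItem implements Externalizable {
        private static final long serialVersionUID = 1L;
        String name = "foo";
        String code = "bar";
        int price = 123456;

        public ExtItem() { }  // public no-argument constructor is required

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeUTF(code);
            out.writeInt(price);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // must read in exactly the order used by writeExternal
            name = in.readUTF();
            code = in.readUTF();
            price = in.readInt();
        }
    }

    // Serializes one object into memory and returns the number of bytes written.
    static int serializedSize(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Serializable:   " + serializedSize(new SerItem()) + " bytes");
        System.out.println("Externalizable: " + serializedSize(new ExtItem()) + " bytes");
    }
}
```

The Externalizable variant should come out smaller, because its class description carries no field names or field types. The exact numbers depend on the class and field names, so treat them as illustrative.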

Externalizing is more flexible

If you save a shopping list, then you only want the product ids, right?

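import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectOutput;
import java.util.List;

public class ShoppingList implements Externalizable {
  String name;
  List<Product> productList;

  @Override
  public void writeExternal(ObjectOutput out) throws IOException
  {
    out.writeUTF(name);
    // write the count first so readExternal knows how many entries follow
    out.writeInt(productList.size());
    for (Product product : productList)
    {
      // save only product id
      out.writeUTF(product.getEanCode());
    }
  }
  ...
}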

But if you are making a bill, then you also want to save the prices, right?

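import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectOutput;
import java.util.List;

public class Bill implements Externalizable {
  String name;
  List<Product> productList;

  @Override
  public void writeExternal(ObjectOutput out) throws IOException
  {
    out.writeUTF(name);
    // write the count first so readExternal knows how many entries follow
    out.writeInt(productList.size());
    for (Product product : productList)
    {
      // save product id and price
      out.writeUTF(product.getEanCode());
      out.writeInt(product.getPrice());
    }
  }
  ...
}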

So, in some cases the price is transient and in some cases it is not. How would you solve this with the transient keyword? I will let you figure this one out. The transient keyword fixes the decision per field at compile time, so it cannot offer this kind of flexibility.
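The same idea also covers the run-time case from the question: writeExternal() is ordinary code, so it can branch on the object's state at the moment it is saved. Below is a minimal sketch under some assumptions. ProductList, its includePrices flag, and this simplified Product class are hypothetical names made up for the illustration, and the flag is written into the stream so that readExternal() knows which layout to expect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class RuntimeChoiceDemo {

    // Hypothetical stand-in for the Product class used in the answer.
    static class Product {
        final String eanCode;
        final int price;

        Product(String eanCode, int price) {
            this.eanCode = eanCode;
            this.price = price;
        }
    }

    static class ProductList implements Externalizable {
        String name = "";
        List<Product> products = new ArrayList<>();
        boolean includePrices;    // set by the caller at run time

        public ProductList() { }  // required by Externalizable

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeBoolean(includePrices);  // tells the reader which layout follows
            out.writeInt(products.size());
            for (Product product : products) {
                out.writeUTF(product.eanCode);
                if (includePrices) {
                    out.writeInt(product.price);
                }
            }
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
            includePrices = in.readBoolean();
            int count = in.readInt();
            products = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                String eanCode = in.readUTF();
                int price = includePrices ? in.readInt() : 0;  // 0 means "price not stored"
                products.add(new Product(eanCode, price));
            }
        }
    }

    // Serializes the list to memory and reads it back.
    static ProductList roundTrip(ProductList list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ProductList) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ProductList list = new ProductList();
        list.name = "groceries";
        list.products.add(new Product("4006381333931", 199));

        list.includePrices = false;  // shopping list: ids only
        System.out.println(roundTrip(list).products.get(0).price);

        list.includePrices = true;   // bill: ids and prices
        System.out.println(roundTrip(list).products.get(0).price);
    }
}
```

One class now serves both the shopping-list and the bill case, and the choice is made per object when it is written rather than per field when the class is compiled.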

Design considerations

However, there are some dangers as well. A class can only be Externalizable if it has a public default constructor (a public constructor without arguments), because deserialization first creates the object through that constructor and then calls readExternal().

That makes it impossible to make non-static inner classes Externalizable. The problem is that the compiler adds a hidden parameter to every constructor of a non-static inner class, carrying a reference to the enclosing instance. So a non-static inner class can never have a true no-argument constructor. Static nested classes don't have this problem.
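The constructor rule is easy to observe. In this sketch (all class names are hypothetical), both classes can be written, but reading back the one without a public no-argument constructor is expected to fail with an InvalidClassException, which the canRoundTrip helper turns into a boolean.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ConstructorRuleDemo {

    // Has the public no-argument constructor that Externalizable needs.
    static class WithNoArgCtor implements Externalizable {
        String value = "";

        public WithNoArgCtor() { }

        WithNoArgCtor(String value) { this.value = value; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(value);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            value = in.readUTF();
        }
    }

    // Only offers a constructor with an argument, so it cannot be recreated.
    static class WithoutNoArgCtor implements Externalizable {
        String value;

        public WithoutNoArgCtor(String value) { this.value = value; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(value);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            value = in.readUTF();
        }
    }

    // Returns true if the object survives a write followed by a read.
    static boolean canRoundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
            return true;
        } catch (InvalidClassException e) {
            return false;  // raised when the class cannot be instantiated on read
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canRoundTrip(new WithNoArgCtor("ok")));
        System.out.println(canRoundTrip(new WithoutNoArgCtor("broken")));
    }
}
```

Note that both classes here are static nested classes. A non-static inner class would fail in the same way even if its source declares a constructor with no parameters.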

You also have to consider the possibility of modifying your object in future (e.g. adding non-transient fields). Serializable classes could have backwards compatibility issues, but don't require code changes per se. Externalizable classes will require a code change in your read/write method, but have more options to handle compatibility issues.
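One common way to use those extra options is to write a format version as the first value and branch on it when reading. This is a sketch of that technique, not something the Externalizable API prescribes. Account, CURRENT_VERSION, and the versionToWrite field are hypothetical, and versionToWrite exists only so that the demo can produce the old layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class VersionedFormatDemo {

    static class Account implements Externalizable {
        static final int CURRENT_VERSION = 2;

        String owner = "";
        String email = "";                     // field added in version 2
        int versionToWrite = CURRENT_VERSION;  // demo only: lets us emit the old layout

        public Account() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(versionToWrite);      // the version always comes first
            out.writeUTF(owner);
            if (versionToWrite >= 2) {
                out.writeUTF(email);
            }
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            int version = in.readInt();
            if (version < 1 || version > CURRENT_VERSION) {
                throw new IOException("Unsupported format version: " + version);
            }
            owner = in.readUTF();
            // older data has no email, so fall back to a default value
            email = (version >= 2) ? in.readUTF() : "unknown";
        }
    }

    // Serializes the account to memory and reads it back.
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account old = new Account();
        old.owner = "ann";
        old.email = "ann@example.com";
        old.versionToWrite = 1;  // simulate data written before email existed
        System.out.println(roundTrip(old).email);
    }
}
```

The cost is that every format change means touching both methods, which is the trade-off described above.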

Just one more thing. If you are choosing this "technology" to communicate between different applications, then please just don't. What you want is something like JAXB (XML binding). It's less compact, but more transparent, far less prone to compatibility issues, and just as flexible.

Hidden features

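Just to be complete, there is one more thing which makes this topic a bit more complicated. It's actually possible to use custom read/write methods without using the Externalizable interface at all. A Serializable class can define private writeObject and readObject methods, and the serialization mechanism calls them instead of its default field handling. This is a supported part of Java serialization rather than a leftover from before Externalizable, but if you want full control over the format, I would prefer Externalizable.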
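For comparison, here is roughly what that alternative looks like. It is a minimal sketch with a made-up Temperature class: the transient field is skipped by default serialization, and the private methods add it back in a custom encoding (tenths of a degree as an int, an arbitrary choice for the example).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class WriteObjectDemo {

    static class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;

        String city = "";
        transient double celsius;  // skipped by default serialization

        // Called by ObjectOutputStream instead of the default field handling.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();                      // writes city as usual
            out.writeInt((int) Math.round(celsius * 10));  // custom encoding in tenths
        }

        // Called by ObjectInputStream; must mirror writeObject.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();                        // restores city
            celsius = in.readInt() / 10.0;
        }
    }

    // Serializes the object to memory and reads it back.
    static Temperature roundTrip(Temperature temperature) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(temperature);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Temperature) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature t = new Temperature();
        t.city = "Oslo";
        t.celsius = 21.5;
        Temperature copy = roundTrip(t);
        System.out.println(copy.city + " " + copy.celsius);
    }
}
```

Unlike Externalizable, this approach keeps the field descriptions in the stream and needs no public no-argument constructor on the class itself.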

bvdb answered Jan 09 '23 11:01
