
Using readClassDescriptor() and maybe resolveClass() to permit Serialization versioning

I am investigating different options in the Java Serialization mechanism to allow flexibility in our class structures for version-tolerant storage (and advocating for a different mechanism, you don't need to tell me).

For instance, the default serialization mechanism can handle both adding and removing fields, if only backwards compatibility is required.

Renaming a class or moving it to a different package has proved to be much harder, though. I found in this question that I was able to do a simple class rename and/or move package, by subclassing ObjectInputStream and overriding readClassDescriptor():

    if (resultClassDescriptor.getName().equals("package.OldClass"))
        resultClassDescriptor = ObjectStreamClass.lookup(newpackage.NewClass.class);

That is fine for simple renames. But if you then try to add or delete a field, you get a java.io.StreamCorruptedException. Worse, this happens even if a field has been added or deleted, and then you rename the class, which could cause problems with multiple developers or multiple checkins.
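For reference, here is a minimal, self-contained sketch of the simple rename case described above. The names `RenameDemo`, `OldPoint`, `NewPoint` and `RenamingInputStream` are hypothetical and exist only for illustration; the two classes are nested in one file and deliberately have identical fields, which is an assumption the sketch depends on.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

class RenameDemo {
    // Hypothetical "before" class: this is what was written to the stream.
    static class OldPoint implements Serializable {
        private static final long serialVersionUID = 1L;
        int x, y;
        OldPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical "after" class: same fields, different name.
    static class NewPoint implements Serializable {
        private static final long serialVersionUID = 1L;
        int x, y;
    }

    static class RenamingInputStream extends ObjectInputStream {
        RenamingInputStream(InputStream in) throws IOException { super(in); }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            // Always consume the descriptor from the stream first.
            ObjectStreamClass fromStream = super.readClassDescriptor();
            if (fromStream.getName().equals(OldPoint.class.getName())) {
                // Swap in the descriptor of the local, renamed class.
                return ObjectStreamClass.lookup(NewPoint.class);
            }
            return fromStream;
        }
    }

    // Serializes an object with the default mechanism.
    static byte[] write(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Deserializes through the renaming stream.
    static Object readRenamed(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new RenamingInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

Because the descriptor returned here comes from the local `NewPoint` class, the field list recorded in the stream is no longer consulted, so this should only be expected to work while the two classes have matching fields.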

Based on some reading I had done, I also experimented with overriding resolveClass(), on the theory that readClassDescriptor() was correctly repointing the name to the new class, but that the old class itself was not being loaded and deserialization was bombing on the field changes. But this comes from a very vague understanding of some of the details of the Serialization mechanism, and I'm not sure if I'm even barking up the right tree.

So 2 precise questions:

  1. Why is repointing the class name using readClassDescriptor() causing deserialization to fail on normal, compatible class changes?
  2. Is there a way using resolveClass() or another mechanism to get around this and allow classes to both evolve (add and remove fields) and be renamed/repackaged?

I poked around and could not find an equivalent question on SO. By all means, point me to such a question if it exists, but please read the question carefully enough that you do not close me unless another question actually answers my precise question.

asked Jun 07 '12 17:06 by orbfish



2 Answers

I had the same flexibility problems as you and found a way around them. Here is my version of readClassDescriptor():

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectStreamClass;
    import java.lang.reflect.Field;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.logging.Logger;

    class HackedObjectInputStream extends ObjectInputStream
    {
        private static final Logger LOGGER =
                Logger.getLogger(HackedObjectInputStream.class.getName());

        /**
         * Migration table. Maps old class names (as stored in the stream) to the new classes.
         */
        private static final Map<String, Class<?>> MIGRATION_MAP = new HashMap<String, Class<?>>();

        static
        {
            MIGRATION_MAP.put("DBOBHandler", com.foo.valueobjects.BoardHandler.class);
            MIGRATION_MAP.put("DBEndHandler", com.foo.valueobjects.EndHandler.class);
            MIGRATION_MAP.put("DBStartHandler", com.foo.valueobjects.StartHandler.class);
        }

        /**
         * Constructor.
         * @param stream input stream
         * @throws IOException if io error
         */
        public HackedObjectInputStream(final InputStream stream) throws IOException
        {
            super(stream);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException
        {
            // Keep the descriptor read from the stream, so its field metadata is preserved.
            ObjectStreamClass resultClassDescriptor = super.readClassDescriptor();

            Class<?> replacement = MIGRATION_MAP.get(resultClassDescriptor.getName());
            if (replacement != null)
            {
                try
                {
                    // Overwrites the private "name" field of ObjectStreamClass. This relies on a
                    // JDK implementation detail; on Java 9+ it typically also needs
                    // --add-opens java.base/java.io=ALL-UNNAMED.
                    Field f = ObjectStreamClass.class.getDeclaredField("name");
                    f.setAccessible(true);
                    f.set(resultClassDescriptor, replacement.getName());
                }
                catch (Exception e)
                {
                    LOGGER.severe("Error while replacing class name: " + e.getMessage());
                }
            }

            return resultClassDescriptor;
        }
    }

answered Dec 18 '22 19:12 by gaponov


The problem is that readClassDescriptor() is supposed to tell the ObjectInputStream how to read the data that is currently in the stream. If you look inside a serialized data stream, you will see that it stores not only the data but also a lot of metadata about exactly which fields are present. This is what allows serialization to handle simple field additions and removals. However, when you override that method and discard the descriptor read from the stream, you are discarding the information about which fields are in the serialized data.
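One way to see that metadata is to record what super.readClassDescriptor() returns while reading. The following is only a sketch; `DescriptorDump`, `Sample` and `RecordingInputStream` are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DescriptorDump {
    // Hypothetical serializable class used only to produce a stream.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        int count = 7;
        String label = "demo";
    }

    // Records the class name and field names of every descriptor found in the stream.
    static class RecordingInputStream extends ObjectInputStream {
        final Map<String, List<String>> fieldsByClass = new LinkedHashMap<>();

        RecordingInputStream(InputStream in) throws IOException { super(in); }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            ObjectStreamClass desc = super.readClassDescriptor();
            List<String> names = new ArrayList<>();
            for (ObjectStreamField field : desc.getFields()) {
                names.add(field.getName());
            }
            fieldsByClass.put(desc.getName(), names);
            return desc; // returned unchanged, so normal deserialization continues
        }
    }

    // Serializes an object with the default mechanism.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Reads one object and returns the descriptors that were stored in the stream.
    static Map<String, List<String>> describe(byte[] data)
            throws IOException, ClassNotFoundException {
        try (RecordingInputStream in = new RecordingInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return in.fieldsByClass;
        }
    }
}
```

The field names come from the bytes in the stream, not from the class currently on the classpath, which is exactly the information that gets lost when the descriptor is swapped for one from ObjectStreamClass.lookup().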

I think the solution would be to take the value returned by super.readClassDescriptor() and create a new class descriptor that returns the new class name but otherwise returns the info from the old descriptor. (Looking at ObjectStreamField, it may be more complicated than that, but that is the general idea.)

answered Dec 18 '22 19:12 by jtahlborn