
How can I link two Java serialised objects back together?

Quite often in Java, two objects hold references to the same third object. If we serialise the two objects separately, it is appropriate that each serialised form contains its own copy of the shared object, since it should be possible to open one without the other. However, if we then deserialise both, the copies stay separate. Is there any way to link them back together?

Example follows.

public class Example {

 private static class ContainerClass implements java.io.Serializable {
  private ReferencedClass obj;
  public ReferencedClass get() {
   return obj;
  }
  public void set(ReferencedClass obj) {
   this.obj = obj;
  }
 }

 private static class ReferencedClass implements java.io.Serializable {
  private int i = 0;
  public int get() {
   return i;
  }
  public void set(int i) {
   this.i = i;
  }
 }

 public static void main(String[] args) throws Exception {
  //Initialise the classes
  ContainerClass test1 = new ContainerClass();
  ContainerClass test2 = new ContainerClass();
  ReferencedClass ref = new ReferencedClass();

  //Make both containers point to the same referenced object
  test1.set(ref);
  test2.set(ref);

  //This does what we expect: setting the integer in one (way of accessing the) referenced class sets it in the other one
  test1.get().set(1234);
  System.out.println(Integer.toString(test2.get().get()));

  //Now serialise the container classes
  java.io.ObjectOutputStream os = new java.io.ObjectOutputStream(new java.io.FileOutputStream("C:\\Users\\Public\\test1.ser"));
  os.writeObject(test1);
  os.close();
  os = new java.io.ObjectOutputStream(new java.io.FileOutputStream("C:\\Users\\Public\\test2.ser"));
  os.writeObject(test2);
  os.close();

  //And deserialise them
  java.io.ObjectInputStream is = new java.io.ObjectInputStream(new java.io.FileInputStream("C:\\Users\\Public\\test1.ser"));
  ContainerClass test3 = (ContainerClass)is.readObject();
  is.close();
  is = new java.io.ObjectInputStream(new java.io.FileInputStream("C:\\Users\\Public\\test2.ser"));
  ContainerClass test4 = (ContainerClass)is.readObject();
  is.close();

  //We expect the same thing as before, and would expect a result of 4321, but this doesn't happen as the referenced objects are now separate instances
  test3.get().set(4321);
  System.out.println(Integer.toString(test4.get().get()));
 }

}
Adam Burley asked Apr 26 '10 21:04


3 Answers

The readResolve() method allows this (of course, you first have to define how you are going to decide which objects are meant to be "the same"). But it would be much easier to serialize both objects into the same file: ObjectOutputStream and ObjectInputStream keep a record of every object they have serialized or deserialized, and will only store and return references to objects they have already seen.
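A minimal sketch of the same-stream approach, using an in-memory byte array in place of a file. The class names SharedStreamDemo, Referenced and Container and the roundTrip helper are made up for illustration and are not from the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedStreamDemo {

    // Hypothetical stand-ins for the question's ReferencedClass and ContainerClass.
    static class Referenced implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    static class Container implements Serializable {
        private static final long serialVersionUID = 1L;
        Referenced ref;
    }

    // Writes both containers to ONE stream, then reads both back from ONE stream.
    static Container[] roundTrip(Container a, Container b) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(a);
            out.writeObject(b); // the shared Referenced is written as a back-reference
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Container a2 = (Container) in.readObject();
            Container b2 = (Container) in.readObject();
            return new Container[] { a2, b2 };
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Referenced shared = new Referenced();
        Container test1 = new Container();
        Container test2 = new Container();
        test1.ref = shared;
        test2.ref = shared;

        Container[] copies = roundTrip(test1, test2);
        copies[0].ref.value = 4321;
        System.out.println(copies[1].ref.value);            // 4321
        System.out.println(copies[0].ref == copies[1].ref); // true
    }
}
```

The link only survives if both writeObject calls go to the same ObjectOutputStream and both reads come from the same ObjectInputStream. Opening a second stream over the same file starts with an empty record of seen objects.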

Michael Borgwardt answered Nov 02 '22 14:11


I've done something like that for an application server / object database that I'm building. What are your requirements - why do you need to do that? If your requirements are something less than an application server, then maybe some other design would solve it more easily.

If you want to proceed anyway, here is how to do it:

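First you need to hook into the serialization process by overriding the ObjectOutputStream.replaceObject() and ObjectInputStream.resolveObject() methods (the streams only call them after you enable them with enableReplaceObject(true) and enableResolveObject(true)). See my ObjectSerializer for an example.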

When you serialize the objects, you must assign a unique ID to each object instance that you want to have a unique identity - such objects are commonly called entities. When an object refers to other entities, you must replace those other entities with a placeholder object which contains the ID of the referred entity.

Then when the objects are deserialized, you must replace each of those placeholder objects with the real entity object which has that ID. You need to keep track of the entity object instances which have been loaded into memory and their IDs, so that for each ID only one instance is ever created. If the entity is not yet loaded into memory, you must load it from where it was saved. See my EntityManager for an example.
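The placeholder idea can be sketched roughly as follows. This is not the ObjectSerializer or EntityManager code mentioned above: Entity, EntityRef, EntityOutputStream, EntityInputStream and the loaded map are hypothetical names, and the map simply stands in for whatever storage the entities would really be loaded from.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

class PlaceholderDemo {

    // Hypothetical entity: an object with its own identity, stored separately.
    static class Entity implements Serializable {
        final long id;
        int value;
        Entity(long id) { this.id = id; }
    }

    // Hypothetical placeholder written to the stream instead of the entity.
    static class EntityRef implements Serializable {
        final long id;
        EntityRef(long id) { this.id = id; }
    }

    static class Container implements Serializable {
        Entity ref;
    }

    // Swaps every Entity for an EntityRef while writing.
    static class EntityOutputStream extends ObjectOutputStream {
        EntityOutputStream(OutputStream out) throws IOException {
            super(out);
            enableReplaceObject(true); // replaceObject() is not called unless enabled
        }
        @Override
        protected Object replaceObject(Object obj) {
            return (obj instanceof Entity) ? new EntityRef(((Entity) obj).id) : obj;
        }
    }

    // Swaps every EntityRef back for the already loaded Entity while reading.
    static class EntityInputStream extends ObjectInputStream {
        private final Map<Long, Entity> loaded;
        EntityInputStream(InputStream in, Map<Long, Entity> loaded) throws IOException {
            super(in);
            this.loaded = loaded;
            enableResolveObject(true); // resolveObject() is not called unless enabled
        }
        @Override
        protected Object resolveObject(Object obj) throws IOException {
            if (!(obj instanceof EntityRef)) {
                return obj;
            }
            Entity entity = loaded.get(((EntityRef) obj).id);
            if (entity == null) {
                // A fuller version would load the entity from storage here.
                throw new InvalidObjectException("Entity not loaded: " + ((EntityRef) obj).id);
            }
            return entity;
        }
    }

    static byte[] write(Container c) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (EntityOutputStream out = new EntityOutputStream(bytes)) {
            out.writeObject(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Container read(byte[] data, Map<Long, Entity> loaded) {
        try (EntityInputStream in = new EntityInputStream(new ByteArrayInputStream(data), loaded)) {
            return (Container) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Entity shared = new Entity(1L);
        Container test1 = new Container();
        Container test2 = new Container();
        test1.ref = shared;
        test2.ref = shared;

        byte[] data1 = write(test1); // two independent serialised forms
        byte[] data2 = write(test2);

        Map<Long, Entity> loaded = new HashMap<>();
        loaded.put(shared.id, shared); // stands in for loading the entity from storage

        Container test3 = read(data1, loaded);
        Container test4 = read(data2, loaded);
        test3.ref.value = 4321;
        System.out.println(test4.ref.value);        // 4321
        System.out.println(test3.ref == test4.ref); // true
    }
}
```

Note that in this sketch the entity itself is never written by EntityOutputStream, so it has to be persisted by some other route. A real implementation would also need to treat the root object specially if the root can itself be an entity.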

If you want to do lazy loading, to avoid loading the whole object graph into memory when it is not needed, you must do something similar to transparent references. See their implementation here. If you're getting that far, you might as well copy those parts (the packages entities, entities.tref, serial and maybe also context) from my project - it has a permissive license - and modify them to fit your needs (i.e. remove stuff you don't need).

Esko Luontola answered Nov 02 '22 15:11


As the answers above say, readResolve() is the key, as it allows you to replace the "duplicate" object with the one you want.

Assuming your class implements hashCode() and equals(), you can implement deduplication by creating a static WeakHashMap that holds references to all created objects still in memory. For example:

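import java.io.Serializable;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class ReferencedClass implements Serializable
{
   // Pool of canonical instances. Keys and values are both held weakly, so an
   // entry disappears once nothing else references the instance.
   // Lookups rely on equals() and hashCode(), which are not shown here.
   static private final Map<ReferencedClass, WeakReference<ReferencedClass>> map = new WeakHashMap<ReferencedClass, WeakReference<ReferencedClass>>();

   // Returns the pooled instance equal to obj, registering obj if there is none yet.
   // Synchronized because WeakHashMap is not thread-safe.
   static public synchronized ReferencedClass findOriginal(ReferencedClass obj)
   {
      WeakReference<ReferencedClass> originalRef = map.get(obj);
      ReferencedClass original = originalRef==null ? null : originalRef.get();
      if (original==null)
      {
          original = obj;
          map.put(original, new WeakReference<ReferencedClass>(original));
      }
      return original;
   }

   public ReferencedClass()
   {
        // Add objects created with new to the pool as well
        findOriginal(this);
   }

   // Called by the serialization runtime after this object has been read;
   // whatever is returned here replaces the freshly deserialized copy
   private Object readResolve()
   {
       return findOriginal(this);
   }
}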

When deserializing, readResolve() calls ReferencedClass.findOriginal(this) to fetch the current original instance. If instances of this class are only created by deserialization, then this will work as is. If you are also constructing objects (using the new operator), then your constructors should also call findOriginal, passing this, so that those objects are also added to the pool.

With these changes in place, both ContainerClass instances will point to the same ReferencedClass instance, even though they were deserialized independently.
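To see the effect end to end, here is a small self-contained sketch that serialises two containers into separate byte arrays and reads them back. It makes some assumptions that are not in the answer: identity is decided by a hypothetical id field (used in equals() and hashCode()), the pool is only filled from readResolve() and not from the constructor, and the names ReadResolveDemo, Referenced, Container, write and read are invented for the demo.

```java
import java.io.*;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class ReadResolveDemo {

    static class Referenced implements Serializable {
        private static final long serialVersionUID = 1L;
        // Pool of canonical instances, keyed by equals()/hashCode() (here: the id).
        private static final Map<Referenced, WeakReference<Referenced>> POOL = new WeakHashMap<>();

        private final long id; // hypothetical identity key, an assumption of this sketch
        int value;

        Referenced(long id) { this.id = id; }

        static synchronized Referenced findOriginal(Referenced obj) {
            WeakReference<Referenced> ref = POOL.get(obj);
            Referenced original = (ref == null) ? null : ref.get();
            if (original == null) {
                original = obj;
                POOL.put(original, new WeakReference<>(original));
            }
            return original;
        }

        // Replaces each deserialized copy with the pooled instance for its id.
        private Object readResolve() { return findOriginal(this); }

        @Override public boolean equals(Object o) {
            return o instanceof Referenced && ((Referenced) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    static class Container implements Serializable {
        private static final long serialVersionUID = 1L;
        Referenced ref;
    }

    static byte[] write(Container c) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Container read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Container) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Referenced shared = new Referenced(1L);
        Container test1 = new Container();
        Container test2 = new Container();
        test1.ref = shared;
        test2.ref = shared;

        byte[] data1 = write(test1); // separate streams, as in the question
        byte[] data2 = write(test2);

        Container test3 = read(data1);
        Container test4 = read(data2);
        test3.ref.value = 4321;
        System.out.println(test4.ref.value);        // 4321
        System.out.println(test3.ref == test4.ref); // true
    }
}
```

One consequence of this design is that the first instance seen for a given id wins: the state carried by later serialised copies is discarded when readResolve() swaps them for the pooled instance.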

mdma answered Nov 02 '22 15:11