
Serialization of objects: no thread state can be involved, right?

I am looking hard at the basic principles of storing the state of an executing program to disk, and bringing it back in again. In the current design that we have, each object (which is a C-level thingy with function pointer lists, kind of low-level home-made object-orientation -- and there are very good reasons for doing it this way) will be called to export its explicit state to a writable and restorable format. The key property to make this work is that all state related to an object is indeed encapsulated in the object data structures.

There are other solutions where you work with active objects, where there is a user-level thread attached to some objects. And thus, the program counter, register contents, and stack contents suddenly become part of the program state. As far as I can see, there is no good way to serialize such things to disk at an arbitrary point in time. The threads have to go park themselves in some special state where nothing is represented by the program counter et al, and thus basically "save" their execution state machine state to the explicit object state.
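To make the "parking" idea concrete, here is a minimal sketch. It is written in Python purely for illustration (the system described above is C), and `Blinker`, `phase`, `count`, and `step` are made-up names, not anything from the actual simulator. The loop position that a thread would hold in its program counter is kept in an explicit field, so the object can be saved between any two steps.

```python
import json


class Blinker:
    """Hypothetical device model: what a thread would do in a loop,
    rewritten so the loop position lives in an explicit field."""

    def __init__(self, phase="idle", count=0):
        self.phase = phase   # stands in for the program counter
        self.count = count   # stands in for a local variable on the stack

    def step(self):
        """Run one atomic transition and return; nothing stays on the stack."""
        if self.phase == "idle":
            self.phase = "on"
        elif self.phase == "on":
            self.count += 1
            self.phase = "off"
        else:
            self.phase = "on"

    def save(self):
        """Export the explicit state to a storable format."""
        return json.dumps({"phase": self.phase, "count": self.count})

    @classmethod
    def load(cls, text):
        """Rebuild an object from the stored state."""
        return cls(**json.loads(text))


a = Blinker()
for _ in range(3):
    a.step()             # idle -> on -> off -> on

snapshot = a.save()      # safe: we are between steps
b = Blinker.load(snapshot)

a.step()
b.step()                 # the restored copy continues identically
```

The price is that the behavior has to be written as transitions rather than as straight-line code with blocking calls, which is exactly the restriction the question is about.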

I have looked at a range of serialization libraries, and as far as I can tell this is a universal property.

The core question is this: is this actually not so? Are there save/restore solutions out there that can include thread state, in terms of where in its code a thread is executing?

Note that saving an entire system state in a virtual machine does not count: that is not really serializing the state, just freezing a machine and moving it. It is an obvious solution, but a bit heavyweight most of the time.

Some questions made it clear that I was not clear enough in explaining how we do things. We are working on a simulator system, with very strict rules for how code running inside it is allowed to be written. In particular, we make a complete divide between object construction and object state. The interface function pointers are recreated every time you set up the system, and are not part of the state. The state consists only of specifically appointed "attributes", each of which has a defined get/set function that converts between the internal runtime representation and the storage representation. Pointers between objects are all converted to names. So in our design, an object might come out like this in storage:

Object foo {
  value1: 0xff00ff00;
  value2: 0x00ffeedd;
  next_guy_in_chain: bar;
}

Object bar {
  next_guy_in_chain: null;
}

Linked lists are never really present in the simulation structure; each object represents a unit of hardware of some kind.
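As a rough sketch of the get/set attribute scheme and the pointer-to-name conversion described above (in Python for brevity; `SimObject`, `get_attributes`, `set_attributes`, `checkpoint`, and `restore` are hypothetical names, not the simulator's real interface), a restore can be done in two passes: construct all objects first, then set attributes and resolve names back to references.

```python
class SimObject:
    """Hypothetical simulation object with explicitly declared attributes."""

    def __init__(self, name):
        self.name = name
        self.value1 = 0
        self.next_guy_in_chain = None   # runtime form: a direct reference

    def get_attributes(self):
        """Convert the runtime representation to the storage representation."""
        nxt = self.next_guy_in_chain
        return {
            "value1": self.value1,
            "next_guy_in_chain": nxt.name if nxt is not None else None,
        }

    def set_attributes(self, stored, registry):
        """Convert storage form back; names are looked up in `registry`."""
        self.value1 = stored["value1"]
        ref = stored["next_guy_in_chain"]
        self.next_guy_in_chain = registry[ref] if ref is not None else None


def checkpoint(objects):
    """Collect the stored form of every object, keyed by object name."""
    return {obj.name: obj.get_attributes() for obj in objects}


def restore(stored):
    """Rebuild all objects from their stored form."""
    # Pass 1: construct every object (construction is not part of the state).
    registry = {name: SimObject(name) for name in stored}
    # Pass 2: set attributes, resolving names to references.
    for name, attrs in stored.items():
        registry[name].set_attributes(attrs, registry)
    return registry


foo, bar = SimObject("foo"), SimObject("bar")
foo.value1 = 0xFF00FF00
foo.next_guy_in_chain = bar

saved = checkpoint([foo, bar])
world = restore(saved)
```

Because references are stored as names, the order in which objects are written or read back does not matter.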

The problem is that some people want to do this, but also have threads as a way to code behavior. "Behavior" here is really mutation of the state of the simulation units. Basically, the design we have says that all such changes have to be made in atomic, complete operations that are called, do their work, and return. All state is stored in the objects. You have a reactive model, which could also be called "run to completion" or "event driven".

The other way of thinking about this is to have objects with active threads working on them, which sit in an eternal loop in the same way as classic Unix threads and never terminate. This is the case that I am trying to see whether it can reasonably be stored to disk, but that does not seem feasible without interposing a VM underneath.

Update, October 2009: A paper related to this was published at the FDL conference in 2009, see this paper about checkpointing and SystemC.

jakobengblom2 asked Oct 08 '08 18:10



1 Answer

I don't think serializing only "some threads" of a program can work, since you will run into problems with synchronization (some of the problems are described here http://java.sun.com/j2se/1.3/docs/guide/misc/threadPrimitiveDeprecation.html ). So persisting your whole program is the only viable way to get a consistent state.

What you might look into is orthogonal persistence. There are some prototypical implementations:

http://research.sun.com/forest/COM.Sun.Labs.Forest.doc.external_www.PJava.main.html

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.7429

But none of them is maintained anymore or has gained much traction (afaik). I guess checkpointing is not the best solution after all. In my own project http://www.siebengeisslein.org I am trying the approach of using lightweight transactions to dispatch an event, so thread state does not have to be maintained: at the end of a transaction the thread call stack is empty again, and if an operation is stopped in mid-transaction everything is rolled back, so the thread call stack does not matter there either. You can probably implement something similar with any OODBMS.
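The transaction-per-event idea can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Counter`, `dispatch`, and the handlers are hypothetical names, and the rollback uses an in-memory `copy.deepcopy` snapshot as a stand-in for what an OODBMS or a real transaction mechanism would do.

```python
import copy


class Counter:
    """Hypothetical simulation object; all of its state is in `value`."""

    def __init__(self):
        self.value = 0


def dispatch(world, handler, event):
    """Run one event handler as an all-or-nothing operation.

    `world` is a dict of objects. On failure the pre-event state is put
    back, so nothing about the interrupted call stack needs saving.
    """
    backup = copy.deepcopy(world)
    try:
        handler(world, event)
        return True
    except Exception:
        world.clear()
        world.update(backup)
        return False


def add(world, event):
    world["c"].value += event


def add_then_fail(world, event):
    world["c"].value += event            # partial update...
    raise RuntimeError("stopped in mid-transaction")


world = {"c": Counter()}
ok1 = dispatch(world, add, 5)            # commits
ok2 = dispatch(world, add_then_fail, 7)  # rolled back
```

Between two calls to `dispatch` the only state is what is stored in `world`, so that is the point at which a checkpoint could be taken.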

Another way to look at this is continuations (http://en.wikipedia.org/wiki/Continuation , http://jauvm.blogspot.com/). They are a way to suspend execution at defined code locations (but they do not necessarily persist the thread state).
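A small illustration of that distinction, using a Python generator as a limited, continuation-like construct (this is an analogy chosen for the sketch, not one of the systems linked above): execution can be suspended and resumed at a defined location, yet the standard `pickle` module in CPython refuses to write the suspended execution to disk.

```python
import pickle


def worker():
    """A generator: execution suspends at each `yield`, a defined location."""
    total = 0
    while True:
        total += 1
        yield total


w = worker()
first = next(w)      # runs to the first yield and suspends there
second = next(w)     # resumes inside the loop

try:
    pickle.dumps(w)              # try to persist the suspended execution
    persisted = True
except TypeError:
    persisted = False            # CPython does not pickle generator objects
```

So suspension alone does not give persistence; the local variable `total` and the resume point stay inside the runtime unless they are moved into explicit object state.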

I hope this gives you some starting points (but there is no ready-to-use solution to this afaik).

EDIT: After reading your clarifications: You should definitely look into OODBMS. Dispatch each event in its own transaction and don't care about threads.

jiriki answered Oct 05 '22 22:10