Serialize/unserialize PHP object-graph to JSON

I wanted to serialize a complete PHP object-graph to a JSON string representation, and unserialize it back to an identical PHP object-graph.

Here is a summary of options I considered, and reasons why they don't work for me:

  • serialize() doesn't do what I want, because it uses a format specific to PHP. I want a format that is widely supported by most languages, and human-readable/editable.

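  • json_encode() doesn't do what I want, because on its own it only handles simple values and arrays; objects are written as their public properties with no class information, so they can't be restored. (I'm actually using this in my implementation, see below.)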

  • var_export() doesn't handle circular references, and doesn't do what I want (see above). (Note that my current implementation does not handle circular references either; see comments and reply below for clarification of this issue.)

  • Sebastian Bergmann's Object Freezer is a nice implementation, but it doesn't do what I want either - it uses a very long form, and relies on stuffing serialized objects with GUIDs.

  • Serialized doesn't do what I want - it does not actually perform serialization, it parses the output of serialize() and produces a different representation, e.g. XML, but is unable to parse that representation. (it also does not support JSON - XML is very long form, and is not what I want.)
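To illustrate the json_encode() point in the list above, here is a minimal sketch. The OrderLine class is hypothetical and exists only for this example; the behavior shown is that of plain json_encode()/json_decode() with no extra options.

```php
<?php
// Hypothetical class, used only for illustration.
class OrderLine
{
    public $item = 'cookies';
    public $amount = 7;
    protected $secret = 'hidden'; // non-public, so json_encode() skips it
}

// Only the public properties are written; the class name is not recorded.
$json = json_encode(new OrderLine());

// Decoding therefore cannot know it should build an OrderLine:
// the result is a generic stdClass instance.
$decoded = json_decode($json);
$isOrderLine = $decoded instanceof OrderLine;
```

The type information is lost on the way out, which is why some extra marker in the JSON is needed to get the same object-graph back.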

I now have a working implementation to share:

https://github.com/mindplay-dk/jsonfreeze

The JSON-representation of the object-graph looks like this:

{
    "#type": "Order",
    "orderNo": 123,
    "lines": [{
        "#type": "OrderLine",
        "item": "milk \"fuzz\"",
        "amount": 3,
        "options": null
    }, {
        "#type": "OrderLine",
        "item": "cookies",
        "amount": 7,
        "options": {
            "#type": "#hash",
            "flavor": "chocolate",
            "weight": "1\/2 lb"
        }
    }],
    "paid": true
}

This approach is designed to work for a pure tree-structure aggregate: circular references are not allowed, nor are multiple references to the same object. In other words, this is not general-purpose like serialize() and unserialize(), which work for any PHP object-graph.

In my initial approach I used a serialized form that was essentially a zero-based list of objects. The first object in the list (number 0) is the root of the serialized object-graph; any other objects are stored in the order they're found.

In the current implementation, the JSON representation resembles the original tree-structure to the extent that this is possible, making it possible to actually work with the JSON representation of an object-graph in JavaScript. The only deviation is the magic #type property (prefixed with # to prevent collision with property-names) and the #hash "type", used to distinguish array-type hashes (stored as JSON objects) from regular array-type arrays (stored as JSON arrays).
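The #type / #hash convention can be sketched roughly as follows. This is not the code from the jsonfreeze repository, just a minimal illustration under some assumptions: the function names freeze_value() and thaw_value() and the Order/OrderLine classes are made up for the example, only public properties are handled, and every class is assumed to have a constructor without required arguments.

```php
<?php
// Hypothetical model classes matching the JSON sample.
class Order { public $orderNo; public $lines = array(); public $paid = false; }
class OrderLine { public $item; public $amount; public $options; }

// Convert a tree of objects/arrays into plain arrays that json_encode() understands.
function freeze_value($value)
{
    if (is_object($value)) {
        $data = array('#type' => get_class($value));
        // Called from outside the class, get_object_vars() returns public properties only.
        foreach (get_object_vars($value) as $name => $prop) {
            $data[$name] = freeze_value($prop);
        }
        return $data;
    }
    if (is_array($value)) {
        // A "list" has keys 0..n-1 in order; anything else is treated as a hash.
        $isList = $value === array() || array_keys($value) === range(0, count($value) - 1);
        $out = $isList ? array() : array('#type' => '#hash');
        foreach ($value as $key => $item) {
            $out[$key] = freeze_value($item);
        }
        return $out;
    }
    return $value; // scalars and null pass through unchanged
}

// Rebuild objects and arrays from the decoded (associative) JSON data.
function thaw_value($data)
{
    if (!is_array($data)) {
        return $data;
    }
    if (!isset($data['#type'])) {
        return array_map('thaw_value', $data); // JSON array => PHP list
    }
    $type = $data['#type'];
    unset($data['#type']);
    if ($type === '#hash') {
        return array_map('thaw_value', $data); // JSON object => PHP hash
    }
    // Assumption: trusted input and a no-argument constructor.
    // Never instantiate class names taken from untrusted JSON.
    $object = new $type();
    foreach ($data as $name => $prop) {
        $object->$name = thaw_value($prop);
    }
    return $object;
}

$line = new OrderLine();
$line->item = 'cookies';
$line->amount = 7;
$line->options = array('flavor' => 'chocolate', 'weight' => '1/2 lb');

$order = new Order();
$order->orderNo = 123;
$order->lines = array($line);
$order->paid = true;

$json = json_encode(freeze_value($order));
$copy = thaw_value(json_decode($json, true));
```

Because json_encode() writes sequential PHP arrays as JSON arrays and everything else as JSON objects, the only extra bookkeeping needed is the #type key.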


I'm leaving these notes about the previous version here for historical purposes.

Circular references are handled simply by never storing nested objects inside the serialized representation of each object - instead, any object-reference is stored as a JSON-object with the object-index - e.g. {"__oref":2} is a reference to the object with index 2 in the object-list.
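As a rough sketch of that earlier idea (again not the original code): each object gets a slot in a flat list, and properties that point to objects are replaced by {"__oref": index}. The Node class, the flatten_object() name and the use of a #type key in each record are assumptions for illustration; arrays containing objects are left out to keep it short.

```php
<?php
// Hypothetical class used to build a small cyclic graph.
class Node
{
    public $name;
    public $next;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Store $object (and everything reachable from it) in $list; return its index.
function flatten_object($object, SplObjectStorage $index, array &$list)
{
    if (isset($index[$object])) {
        return $index[$object]; // already stored: reuse its index
    }
    $id = count($list);
    $index[$object] = $id;
    $list[$id] = null; // reserve the slot before visiting properties, so cycles terminate

    $record = array('#type' => get_class($object));
    foreach (get_object_vars($object) as $name => $value) {
        $record[$name] = is_object($value)
            ? array('__oref' => flatten_object($value, $index, $list))
            : $value;
    }
    $list[$id] = $record;

    return $id;
}

$a = new Node('a');
$b = new Node('b');
$a->next = $b;
$b->next = $a; // circular reference

$list = array();
flatten_object($a, new SplObjectStorage(), $list);
$json = json_encode($list);
```

The root always ends up at index 0, and the cycle between the two nodes is expressed purely through index references.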

I'm having a problem with array-references in my implementation - when I var_dump() inside the code that restores references to objects to the array, they are being populated, but at some point the array gets copied, and you end up with the empty copy. I've tried placing & characters everywhere in the code, but regardless of where I pass by reference, the end-result is an empty array.
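I can't say for certain that this was the cause, but the symptom matches PHP's value semantics for arrays: assigning an array, or iterating over one with foreach, works on a copy unless a reference is bound explicitly. A minimal illustration (variable names are made up):

```php
<?php
$objects = array('first' => array());

// Plain assignment copies the array: changes to $copy never reach $objects.
$copy = $objects['first'];
$copy[] = 'restored';
$countAfterCopy = count($objects['first']);

// Binding by reference makes both names refer to the same array.
$ref = &$objects['first'];
$ref[] = 'restored';
unset($ref); // break the reference so later writes don't go through it
$countAfterRef = count($objects['first']);

// foreach also works on copies unless the value is taken by reference.
$lists = array(array(), array());
foreach ($lists as &$list) {
    $list[] = 'x';
}
unset($list); // same precaution as above
```

A single by-value step anywhere along the path (an assignment, a foreach, a function parameter or a return value) is enough to end up populating a copy.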

asked May 07 '12 22:05 by mindplay.dk


1 Answer

The finished script (posted above) meets my precise requirements:

  • Serialize and unserialize an entire aggregate.

  • Have a JSON representation that closely resembles the original data-structure.

  • Do not pollute the data-structure with dynamically generated keys or other data.

It does not handle circular references. As pointed out in a comment above, there is no right way to store circular references or multiple references to the same object, as these are all equal. Realizing this, I decided my object-graph must be a regular tree, and accepted this limitation as "a good thing".

Update: the output can now be formatted with indentation, newlines and whitespace. It was important for me to have a human-readable (and source-control friendly) representation for my purposes. (The formatting can be enabled or disabled as needed.)
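For comparison, plain json_encode() can produce similar indented output through the JSON_PRETTY_PRINT flag (available since PHP 5.4). This is only a sketch of the general idea and not necessarily how the script does its formatting; the sample data is made up.

```php
<?php
$data = array('#type' => 'Order', 'orderNo' => 123, 'paid' => true);

// Compact form: a single line with no extra whitespace.
$compact = json_encode($data);

// Indented form: one member per line, easier to read and to diff.
$pretty = json_encode($data, JSON_PRETTY_PRINT);

// Both forms decode to the same data.
$same = json_decode($compact, true) === json_decode($pretty, true);
```

Since whitespace between tokens is not significant in JSON, switching the formatting on or off does not affect what gets unserialized.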

answered Nov 14 '22 23:11 by mindplay.dk