Clojure data structure serialization

I have a complex Clojure data structure that I would like to serialize - basically the entire current game state for an online game I am developing so that I can implement save game files.

My requirements are:

  • Some form of human-readable text format (I'd probably prefer s-expressions, JSON and XML in that order but open to others)
  • Support all the usual Clojure data structures, keywords and primitives
  • Ability to provide custom serialization / deserialization functions for custom Java classes, defrecords etc. (this is important because I need to do something like Java's readResolve in several cases)
  • Good performance is a nice-to-have

Any good recommendations?

mikera asked Jul 21 '10 16:07


3 Answers

If you wanted to serialize things to S-expressions, you could use print-dup:

(binding [*print-dup* true] (println [1 2 3]))
; prints [1 2 3]

(defrecord Foo [x])
; => user.Foo
(binding [*print-dup* true] (println (Foo. :foo)))
; prints #=(user.Foo/create {:x :foo}) on Clojure 1.2; Clojure 1.3 and later print #user.Foo[:foo]

Note that if you print a structure that holds, say, ten references to a single vector and then read it back, you get a data structure with ten separate vectors: they are still equal to one another (=), but they are no longer the same object (identical?).
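
A minimal sketch of that loss of identity, assuming a REPL where *read-eval* is left at its default; the names shared, original and restored are only for illustration:

```clojure
;; One vector, referenced twice from the same structure.
(def shared [1 2 3])
(def original [shared shared])

;; Round trip: print with *print-dup* bound to true, then read the text back.
(def restored
  (read-string (binding [*print-dup* true] (pr-str original))))

;; Before the round trip both slots point at the very same object.
(identical? (first original) (second original))

;; After the round trip the value is equal, but the two slots now hold
;; separate vectors that merely have the same contents.
(= original restored)
(identical? (first restored) (second restored))
```

If your game state relies on shared references, you would have to restore that sharing yourself after reading.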

To use this for a type that has no default implementation, add a method for that type to the multimethod clojure.core/print-dup.
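
As a rough sketch of such a method, here is one for java.net.URI, picked purely as an example of a Java class; the names text and restored are illustrative. It emits a #=(...) form that calls the constructor when the text is read, which only works while *read-eval* is true, so this should only be used on files you trust:

```clojure
;; Teach print-dup how to write a java.net.URI as a read-time constructor call.
(defmethod print-dup java.net.URI [^java.net.URI uri ^java.io.Writer w]
  (.write w "#=(java.net.URI. ")
  (print-dup (str uri) w)   ; the string argument, printed readably
  (.write w ")"))

;; Print a vector containing a URI, then read it back.
(def text
  (binding [*print-dup* true]
    (pr-str [(java.net.URI. "http://example.com/")])))

;; The reader evaluates the #=(...) form, so a real URI object comes back.
(def restored (read-string text))
```

The body of the method is also the place to put readResolve-style logic: you decide which form is written, and therefore which constructor or factory function runs at read time.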

Also, a lot of things in Clojure 1.2 are java.io.Serializable:

(every? (partial instance? java.io.Serializable)
        [{1 2} #{"asdf"} :foo 'foo (fn [] :foo)])
; => true

(defrecord Foo [])
(instance? java.io.Serializable (Foo.))
; => true

Note that you should avoid serializing runtime-created fns -- they are instances of one-off classes with weird names and you won't be able to deserialize them after restarting your JVM anyway. With AOT compilation, fns do get their own fixed classnames.
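
For completeness, a minimal sketch of the Serializable route using plain Java object streams. The helper names serialize and deserialize and the sample state map are assumptions made for this example, and the result is binary rather than human-readable:

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  ObjectInputStream ObjectOutputStream))

(defn serialize
  "Writes x to a byte array using Java serialization."
  [x]
  (let [buffer (ByteArrayOutputStream.)]
    (with-open [out (ObjectOutputStream. buffer)]
      (.writeObject out x))
    (.toByteArray buffer)))

(defn deserialize
  "Reads one object back from a byte array produced by serialize."
  [^bytes data]
  (with-open [in (ObjectInputStream. (ByteArrayInputStream. data))]
    (.readObject in)))

;; Plain Clojure data only: no fns and no REPL-defined classes.
(def state {:level 3, :items #{"sword" "shield"}, :position [1 2]})
(def restored (deserialize (serialize state)))
```

To write a save file instead, the same ObjectOutputStream can wrap a FileOutputStream.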

Update: As mentioned in a comment on the question, Serializable is best suited to short-term storage / transfer of data, whereas print-dup should be more robust as a long-term storage solution (working across many versions of the application, Clojure etc.). The reason is that print-dup doesn't in any way depend on the structure of the classes being serialized (so a vector print-dup'd today will still be readable when the vector implementation switches from Java to Clojure's deftype).

Michał Marczyk answered Nov 17 '22 05:11


The edn format (extensible data notation) has since been released as a standard for data transfer using Clojure's data structures.

It is a pretty good fit for serialising Clojure data structures / values, and it is supported across multiple languages, so it can also be used as a data interchange format.
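
A minimal sketch of an edn round trip with a custom type, assuming Clojure 1.5 or later (where clojure.edn is available). The Player record and the #game/player tag are hypothetical names invented for this example:

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical piece of game state.
(defrecord Player [name hp])

;; Print a Player as a tagged literal followed by a plain map of its fields.
(defmethod print-method Player [p ^java.io.Writer w]
  (.write w "#game/player ")
  (print-method (into {} p) w))

(def state {:turn 3, :players [(->Player "ann" 10)]})

;; Human-readable text, suitable for a save file.
(def saved (pr-str state))

;; The :readers map says how to rebuild a value for each tag; this function
;; is where readResolve-style fix-ups could go.
(def loaded
  (edn/read-string {:readers {'game/player map->Player}} saved))
```

Unlike the core reader, the edn reader never evaluates code, which makes it the safer choice for save files that users might edit or share.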

mikera answered Nov 17 '22 07:11


If everything is a Clojure data structure, then it is effectively already serialized, because in Clojure code is data. Just dump the data structures onto disk. To restore them, load them back and (eval).
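
A minimal sketch of this dump-and-reload approach, assuming the state contains only plain data (maps, vectors, keywords, strings, numbers). It swaps in clojure.edn/read-string (Clojure 1.5+) in place of (eval), since plain data only needs to be read, not evaluated; the temp file and the names state, save-file and restored are illustrative:

```clojure
(require '[clojure.edn :as edn])

(def state {:level 3, :players [{:name "ann", :hp 10}]})

;; A throwaway file standing in for a real save-game path.
(def save-file (java.io.File/createTempFile "savegame" ".edn"))

;; Dump: pr-str gives the readable text form, spit writes it to disk.
(spit save-file (pr-str state))

;; Restore: slurp the text and read it back into data.
(def restored (edn/read-string (slurp save-file)))

(.delete save-file)
```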

G__ answered Nov 17 '22 07:11