Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Intermediate representation for a Lisp / Clojure DSL

I'm designing a DSL in Clojure which is used to drive a code generator (in this case for procedural image synthesis - clisk) and am having trouble working out the best representation for intermediate values.

Originally the DSL consisted of functions that returned one or more forms, e.g. (illustrative)

(v+ 1.0 [1.0 'y])
=> ['(+ 1.0 1.0) '(+ 1.0 y)]

These functions could then be composed to build up larger blocks of code.

This was simple and the resulting forms could be fed directly into the code generator. However I've now identified what appear to be some weaknesses with this approach, for example if there is a need to pass around some auxillary data (e.g. objects that cannot be encoded in forms like BufferedImages, metadata useful for optimisation etc.).

I'm sure this is a solved problem in the Lisp world - What would typically be the best intermediate representation for this kind of DSL?

like image 366
mikera Avatar asked May 26 '12 02:05

mikera


2 Answers

Anytime you need an intermediate representation that will be used to generate code, the most obvious thing that comes to my mind is an abstract syntax tree (AST). Your example representation is lists, which in my experience is not as flexible of a form. For any thing more than trivial code generation, I wouldn't beat around the bush, and just go with a full blown AST representation. By using lists, you push off more work to the generation side to parse out the information such as types and what the first item means. Moving to an AST representation will give you more flexibility and decouples more of the system, at the cost of more work from the parsing side (or more work from the functions that generate the forms). The generation side will be doing more work as well, but the many of those components can be decoupled since their inputs will be more structured.

In terms of what should the AST look like, I would copy either Christophe Grand's enlive, where he uses {:tag <tag name> :attrs <map of attrs> :content <some collection>}

or what clojure script uses, {:op <some operator> :children <some collection>}.

This makes it quite general since you can define arbitrary walkers that peek into :children and can traverse any structure without knowing exactly about what the :op's or :tag's are.

Then for atomic components, you can wrap it in a map and give it some type information (with respect to the semantics of your DSL) that's independent of the actual type of the object. {:atom <the object> :type :background-image}.

On the code generation side, when encountering an atom, your code can then dispatch on the :type, and then, if you want, further dispatch on the actual type of the object. Generation from collection forms is also easy, dispatch on :op/:tag, and then recur with the children. For what collection to use for children, I'd read more up on the discussion on the google groups. Their conclusions were enlightening to me.

https://groups.google.com/forum/#!topic/clojure-dev/vZLVKmKX0oc/discussion

To summarize, for children, if there were semantic ordering importance such as in an if statement, then use a map {:conditional z :then y :else x}. If it was just an argument list, then you could use a vector.

like image 80
bmillare Avatar answered Sep 18 '22 13:09

bmillare


I guess I don't understand. I'd just use lists or structures myself.

In Lisp, lists can contain, well, anything. I should say, a CONS cell can point to anything and thus list can contain anything. So can pretty much any other data structure (structures, arrays, maps, etc.).

Now, these structures not be able to be renderable, by PRINT, or rendered in to something readable (by READ), but that doesn't mean they can't be stored and manipulated.

Is there some reason you need to externalize this representation?

like image 32
Will Hartung Avatar answered Sep 20 '22 13:09

Will Hartung