Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Evaluating AST (abstract syntax tree) in Clojure

How to evaluate AST with better performance? Currently we create AST as tree where leaf nodes (terminals) are functions of one argument - map of keywords and their values. Terminals are represented with keywords, and functions (non-terminals) can be user (or clojure) defined functions. Full growth method creates tree from non-terminals and terminals:

(defn full-growth
  "Creates individual by full growth method: root and intermediate nodes are
   randomly selected from non-terminals Ns,
   leaves at depth depth are randomly selected from terminals Ts"
  [Ns Ts arity-fn depth]
  (if (<= depth 0)
    (rand-nth Ts)
    (let [n (rand-nth Ns)]
      (cons n (repeatedly (arity-fn n) #(full-growth Ns Ts arity-fn(dec depth)))))))

Example of generated AST:

=> (def ast (full-growth [+ *] [:x] {+ 2, * 2} 3))
#'gpr.symb-reg/ast
=> ast
(#object[clojure.core$_STAR_ 0x6fc90beb "clojure.core$_STAR_@6fc90beb"]
 (#object[clojure.core$_STAR_ 0x6fc90beb "clojure.core$_STAR_@6fc90beb"]
  (#object[clojure.core$_STAR_ 0x6fc90beb "clojure.core$_STAR_@6fc90beb"]
   :x
   :x)
  (#object[clojure.core$_PLUS_ 0x1b00ba1a "clojure.core$_PLUS_@1b00ba1a"]
   :x
   :x))
 (#object[clojure.core$_PLUS_ 0x1b00ba1a "clojure.core$_PLUS_@1b00ba1a"]
  (#object[clojure.core$_PLUS_ 0x1b00ba1a "clojure.core$_PLUS_@1b00ba1a"]
   :x
   :x)
  (#object[clojure.core$_PLUS_ 0x1b00ba1a "clojure.core$_PLUS_@1b00ba1a"]
   :x
   :x)))

, which is equivalent to

`(~* (~* (~* ~:x ~:x) (~+ ~:x ~:x)) (~+ (~+ ~:x ~:x) (~+ ~:x ~:x)))

(def ast `(~* (~* (~* ~:x ~:x) (~+ ~:x ~:x)) (~+ (~+ ~:x ~:x) (~+ ~:x ~:x))))

We can write fn which directly evaluates this AST as:

(defn ast-fn
  [{x :x}]
  (* (* (* x x) (+ x x)) (+ (+ x x) (+ x x))))

=> (ast-fn {:x 3})
648

We have two methods for creating function based on AST, one with help of apply and map, and the other with help of comp and juxt:

(defn tree-apply
  "((+ :x :x) in) => (apply + [(:x in) (:x in))]"
  ([tree] (fn [in] (tree-apply tree in)))
  ([tree in]
    (if (sequential? tree)
    (apply (first tree) (map #(tree-apply % in) (rest tree)))
    (tree in))))
#'gpr.symb-reg/tree-apply

=> (defn tree-comp
     "(+ :x :x) => (comp (partial apply +) (juxt :x :x))"
     [tree]
     (if (sequential? tree)
       (comp (partial apply (first tree)) (apply juxt (map tree-comp (rest tree))))
       tree))
#'gpr.symb-reg/tree-comp


=> ((tree-apply ast) {:x 3})
648

=> ((tree-comp ast) {:x 3})
648

With timing fn we measure time for executing functions over test cases:

=> (defn timing
     [f interval]
     (let [values (into [] (map (fn[x] {:x x})) interval)]
       (time (into [] (map f) values)))
       true)

=> (timing ast-fn (range -10 10 0.0001))
"Elapsed time: 37.184583 msecs"
true

=> (timing (tree-comp ast) (range -10 10 0.0001))
"Elapsed time: 328.961435 msecs"
true

=> (timing (tree-apply ast) (range -10 10 0.0001))
"Elapsed time: 829.483138 msecs"
true

As you can see there is huge difference in performance between direct function (ast-fn), tree-comp generated function and tree-apply generated function.

Is there some better way?

Edit: madstap's answer looks quite promising. I made some modifications on his solution (terminals could be also some other functions, not just keyword, like constant function which constantly returns value, regardless of input):

(defn c [v] (fn [_] v))
(def c1 (c 1))

(defmacro full-growth-macro
     "Creates individual by full growth method: root and intermediate nodes are
      randomly selected from non-terminals Ns,
      leaves at depth depth are randomly selected from terminals Ts"
     [Ns Ts arity-fn depth]
     (let [tree (full-growth Ns Ts arity-fn depth)
           val-map (gensym)
           ast2f (fn ast2f [ast] (if (sequential? ast)
                   (list* (first ast) (map #(ast2f %1) (rest ast)))
                   (list ast val-map)))
           new-tree (ast2f tree)]
       `{:ast '~tree
         :fn (fn [~val-map] ~new-tree)}))

Now, creating ast-m (with use of constant c1 as terminal) and associated ast-m-fn:

=> (def ast-m (full-growth-macro [+ *] [:x c1] {+ 2 * 2} 3))
#'gpr.symb-reg/ast-m
=> ast-m
{:fn
 #object[gpr.symb_reg$fn__20851 0x31802c12 "gpr.symb_reg$fn__20851@31802c12"],
 :ast
 (+
  (* (+ :x :x) (+ :x c1))
  (* (* c1 c1) (* :x c1)))}
=> (defn ast-m-fn
     [{x :x}]
     (+
     (* (+ x x) (+ x 1))
     (* (* 1 1) (* x 1))))
#'gpr.symb-reg/ast-m-fn

Timing looks very similar:

=> (timing (:fn ast-m) (range -10 10 0.0001))
"Elapsed time: 58.478611 msecs"
true
=> (timing (:fn ast-m) (range -10 10 0.0001))
"Elapsed time: 53.495922 msecs"
true
=> (timing ast-m-fn (range -10 10 0.0001))
"Elapsed time: 74.412357 msecs"
true
=> (timing ast-m-fn (range -10 10 0.0001))
"Elapsed time: 59.556227 msecs"
true
like image 207
Bojan Horvat Avatar asked Oct 29 '22 01:10

Bojan Horvat


1 Answers

Use a macro to write the equivalent of ast-fn.

(ns foo.core
  (:require
   [clojure.walk :as walk]))

(defmacro ast-macro [tree]
  (let [val-map (gensym)
        new-tree (walk/postwalk (fn [x]
                                  (if (keyword? x)
                                    (list val-map x)
                                    x))
                                (eval tree))]
    `(fn [~val-map] ~new-tree)))

On my machine this comes close to the perf of ast-fn. 45 msecs to 50 msecs. It does more lookups, but that can be fixed with some extra tinkering.

Edit: I thought some more about this. evaling the argument at macroexpansion time will limit how you can use this (the argument can't be a local). Making full-growth a macro could work better. Like amalloy says, it's all about what you want to do at runtime vs macroexpansion time.

(defmacro full-growth-macro
  "Creates individual by full growth method: root and intermediate nodes are
   randomly selected from non-terminals Ns,
   leaves at depth depth are randomly selected from terminals Ts"
  [Ns Ts arity-fn depth]
  (let [tree (full-growth Ns Ts arity-fn depth)
        val-map (gensym)
        new-tree (walk/postwalk (fn [x]
                                  (if (keyword? x)
                                    (list val-map x)
                                    x))
                                tree)]
    `{:ast '~tree
      :fn (fn [~val-map] ~new-tree)}))
like image 89
madstap Avatar answered Jan 02 '23 19:01

madstap