Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Advice/discussion on anonymous "self referential" data structures

Apologies for any mistaken terminology--I'm quite new to computer science, and I pretty much only know Clojure (but I guess I'd say I know it pretty well).

So, I haven't done a ton of research on this, but I've sometimes found it useful when writing Clojure code to be able to refer to some "intermediate version of whatever data structure I'm in" from within that data structure (a lot like in a let). Quick examples:

=> (self-ish {:a 10
              :b (inc (this :a))
              :c (count (vals this))})
=> {:a 10, :b 11, :c 3}
=> (self-ish ["a" "b" (reduce str this)])
=> ["a" "b" "ab"]
//Works in any nested bits too
=> (self-ish [1 2 3 [4 5 (first this)] 6 [7 [8 (cons (second this) (nth this 3))]]])
=> [1 2 3 [4 5 1] 6 [7 [8 (2 4 5 1)]]]

The idea is that the structure builds itself up incrementally, and at any stage has the ability to refer to the current intermediate structure as this. Here's the code for my current implementation:

//Random straightforward but helpful definitions
(defn map-entry? [obj]
  (instance? clojure.lang.AMapEntry obj))
(def Map clojure.lang.IPersistentMap)
(def Vector clojure.lang.IPersistentVector)
(def List clojure.lang.IPersistentList)
(def Set clojure.lang.IPersistentSet)

(defn append
  [x coll]
  (if-not coll x
    (condp instance? coll
      Map (if (empty? x) coll
            (assoc coll (first x) (second x)))
      Vector (conj coll x)
      Set (conj coll x)
      List (apply list (concat coll [x]))
      (concat coll [x]))))

(defn build-this
  [acc-stack acc]
  (->> (cons acc acc-stack)
       (drop-while list?)
       (drop-while (every-pred empty? identity))
       (reduce append)))

(defn self-indulge
  [acc-stack acc form]
  ;//Un-comment the following to see it print intermediate stages of processing
  #_(println "this:" (build-this acc-stack acc) "\n  at:" form)
  (append (cond
            (coll? form) (reduce (partial self-indulge (cons acc acc-stack))
                                 (if (map-entry? form) []
                                   (empty form))
                                 form)
            (= (quote this) form) (build-this acc-stack acc)
            :else form)
          acc))

(defmacro self-ish
  [form]
  (self-indulge () nil form))

The append function appends an item onto a collection and returns the same type of collection. The self-indulge function has a standard reduce-like accumulator, which just builds up elements of form. It also has a accumulator stack, which gets longer every time self-indulge recurs upon itself. The point of this is to keep track of other "higher up" accumulators, so that this will be the entire structure, not just a local piece. The self-ish macro just nicely wraps up self-indulge (which calls itself using partial, so it can't wear the macro pants).

Edit: example use case To me, this macro is about trying to increase code readability, not truly extending functionality. Where I have found this useful is in cases where I have hand-written structures with partially redundant data--or maybe "dependent" is a better word. It can be easier to read the code and see what different parts of the data structure hold, and it can also be useful if I modify data values in one part of a structure and want that change to be reflected in other parts. For example:

=> (self-ish {:favorite-books (list "Crime and Punishment" "Mrs. Dalloway")
              :favorite-things (list* "Ice Cream" "Hammocks" (this :favorite-books)})
=> {:favorite-things ("Ice Cream" "Hammocks" "Crime and Punishment" "Mrs. Dalloway"),
    :favorite-books ("Crime and Punishment" "Mrs. Dalloway")}

It could also be useful in times where one might really like to include something baked into the data, as opposed to derived on the fly using some function. These cases are probably much rarer, and I think it would be a bad idea to tangle the data unnecessarily when you could just have nice clean functions manipulating it.

My main questions:

  1. Is this actually useful, or would the ambiguity/complexity incurred be too much? I imagine I'm not alone in wanting/using this type of macro. What are others' experiences here? Do you use something like this? Have you found better workarounds? Are there reasons something like this isn't in any Clojure libraries? Or is there something that I haven't yet seen?
  2. Are there better naming conventions I might use--as opposed to self-ish and this? For example, maybe this is too loaded with OOP meaning, I'm not sure, I am basically only familiar with Clojure.
  3. I am pretty new to computer science, are there accessible and informative resources related to this type of thing--I guess I would call it anonymous self referential (maybe reflexive is a better word?) data structures? I haven't found anything both approachable and informative yet.
  4. Is there a better way to write the self-ish macro? Above, I've included my current version of it, but I can't shake the feeling there may be a simpler way.
  5. I have various questions about what might be the "wisest" implementation details.

    • Traversal: Should it be breadth first or depth first? If depth first, preorder, postorder, or inorder? Right now, I believe it's depth first preorder, which makes sense to me, but maybe it has some drawbacks I haven't noticed.
    • Order problems: (See here for a related previous question of mine) Within {} (i.e. maps written by hand) it's impossible to maintain order properly (above 8 map entries) without using array-map or sorted-map--in other words, above 8 map entries, {} usage is unsafe. Maybe instead of hand-written order, the macro could do some fancy magic to process items in some "ideal" order? Or perhaps it would be better to wrap all maps within (array-map ...) instead of the eye-pleasing {}?

      //Showing maps with 9 entries failing
      => (self-ish {:a 1
                    :b (inc (this :a))
                    :c (inc (this :b))
                    :d (inc (this :c))
                    :e (inc (this :d))
                    :f (inc (this :e))
                    :g (inc (this :f))
                    :h (inc (this :g))
                    :i (inc (this :h))})
      => NullPointerException   clojure.lang.Numbers.ops (Numbers.java:942)
      //8 works fine
      => (self-ish {:a 1
                    :b (inc (this :a))
                    :c (inc (this :b))
                    :d (inc (this :c))
                    :e (inc (this :d))
                    :f (inc (this :e))
                    :g (inc (this :f))
                    :h (inc (this :g))})
      => {:h 8, :g 7, :f 6, :e 5, :d 4, :c 3, :b 2, :a 1}
      
    • Serial: As I've written it, the macro avoids infinite recursion by dealing with its elements serially, similar to let, but that does produce potentially odd behavior. For example, in my above quick example, (reduce str this) returns "ab" because this is ["a" "b"] at that step. Maybe it would be useful sometimes to create some sort of infinite lazy sequence instead? If so, how might that be implemented?

    • Map entries: Right now, map entries are treated like vectors, but because of how this could be invoked at any intermediate step, it is totally possible to get a nil value from a key that has "not yet" been assigned a value. That is why in my first quick example, :c ended up mapped to 3--because intermediately there was a nil corresponding to :c, and that got counted as well. Do you think this warrants fixing?
    • Non-macro utility: It would be trivial to use just self-indulge outside of the macro context, but could this ever be useful?

Thanks for reading, any help is appreciated :)

like image 806
Omri Bernstein Avatar asked Jul 25 '12 13:07

Omri Bernstein


2 Answers

This approach "feels" a bit wrong to me, though I'm not quite sure why. Maybe I don't like the idea of map construction being dependent on order....

Having said that it's a pretty easy macro to write, you effectively want something that expands to:

(let [this {}
      this (assoc this :a 1)
      this (assoc this :b (+ (this :a) 3))]
  this)

Hence an appropriate macro would be something like (for the map case):

(defmacro self-ish [bindings]
  `(let [~'this {}
         ~@(mapcat 
           #(do `(~'this (assoc ~'this ~@%)) )    
           (partition 2 bindings) )]
    ~'this))

(self-ish [:a 1
           :b (+ (this :a) 3)])
=> {:b 4, :a 1}

Note that I'm made the binding form a vector as a map binding form is unordered.

Still not sure how much I like this idiom. My preferred way is usually to define a structure with a let form and give meaningful names to interim calculations e.g.:

(let [meaningful-foo (something)
      meaningful-bar (something-else)]
   {:foo meaningful-foo
    :bar meaningful-bar
    :baz (some-calculation meaningful-foo meaningful-bar)})
like image 192
mikera Avatar answered Nov 11 '22 16:11

mikera


in scheme this is done with (letrec ...) which lets you refer to the name of the data structure inside the structure itself. so if you want to define your own way of doing this it might make more sense to implement that. you could do it using the tricks with references described in the answers to Is it possible to create circular references in Clojure?

one advantage of letrec is that it has a user-specified name (if your macro is nested then this is shadowed).

[edited to remove comment on types as i don't understand your macro.]

[update] also, you may be interested in the discussion of anaphora in joy of clojure section 8.5.1

like image 29
andrew cooke Avatar answered Nov 11 '22 17:11

andrew cooke