Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Build a tree from an vector in Clojure

Tags:

tree

clojure

I'm working on my first Clojure program. I'm having some issues figuring out how to build a tree based on an input that looks like this:

["A B" "A C" "C D" "D E" "A F" "F G"]

And the output should be something like this:

'(A (B) (C (D (E)) (F (G)))

I'm not sure how to start doing this. When it comes to an imperative programming, for example, I'd use nested loops to find whether this relation already exists or not and when it doesn't, I'd find the parent element and append the child to it. But as far as I know, functional programming would use another approach, one that I would recursively walk all the elements in a vector and in lieu of changing the existent tree I'd built a new one.

I'm not sure whether this works or not, but I have a function that looks like this:

(defn build-tree
  [func tree parent child]
  (map (fn [i] (if (seq? i)
                 build-tree
                 (func i parent child))))) 

I'm sure the issue comes from my unfamiliarity with Clojure and functional programming and was hoping someone could explain the best way to use recursion and built this tree.

like image 893
user1067437 Avatar asked Dec 04 '22 23:12

user1067437


2 Answers

You are possibly going to receive a few answers much shorter than mine below. I've decided to kind of teach you fish - instead of showing a short program I am going to take you through a systematic way of solving this class of problems. You will have to fill in a few blanks, including converting your data into a more palatable form. These are minor points that will only distract.

The first thing you need to do is break down your output into construction steps. Those steps should match the abstract type your output is representing. In your case the output's concrete type is a list, but it actually represents a tree. These are different abstract types - a list has a head and a tail, that's it, but the tree has nodes, which may have children. You need to think in terms of abstract constructors that build different types of nodes, not about the accidental specific structure you happened to choose to represent your abstract type - a list in your case.

So imagine that you have two constructors that look like

(defn ->branch
  [id kids]
  ...)

(defn ->leaf
  [id]
  ...)

It is of course assumed that every kid is a tree - that is, either a branch or a leaf. In other words, every kid should be a result of either ->branch or ->leaf.

So the only way to build any tree is to have a nested chain of constructor invocations. That's what I mean by "breaking down the output into construction steps." In your case, it means the following chain:

(->branch :A [(->leaf :B) (->branch :C [(->branch :D (->leaf :E))]) (->branch :F [(->leaf :G)])])

(I am going to use keywords instead of symbols for node ids, otherwise we will get bogged down in binding/quoting details.)

Stop here for a second to feel the difference between functional and imperative styles. You outlined the imperative style yourself - you think in terms of updating a tree by finding the right place and adding a new node there. In functional style, you think in terms of creating a value, not updating another value. (Of course nothing technically prevents you from doing the same in an imperative language - like using a nested call to constructors, or using a static factory this way, it's just not something that comes about naturally.)

To get to your specific tree representation as a list, you just need to fill in the implementation of ->branch and ->leaf above, which is very simple. (Left as an exercise for you.)

Now back to building the tree. The task of coding a function that builds a tree is actually creating a chain of constructor calls from the input data. So you need to understand two things:

  • When to call which constructor (here, when to call ->branch and when call ->leaf)
  • How to get parameters for the constructors (id for either, and kids for ->branch)

To start, how do we know what node to begin? That depends on what the problem is, but let's assume that it is given us as a parameter. So we assume that we are building the following function:

(defn ->tree
  [adj-list node]
  ...)

The adj-list is your input - which is an adjacency list even if you try to disguise it as a list of strings, and we are going to treat it this way (like @SamEstep suggested in the comments.) Left as an exercise for you to convert your form of input into an adjacency list.

So it should be clear what we use as an id in our constructors - the node. But is it ->branch or ->leaf? Well, the answer depends on whether we have direct descendants of node in the adj-list, so apparently we need a function

(defn descendants
  [adj-list node]
  ...)

which returns a list (possibly empty) of ids of direct descendants of the node. (Left as an exercise.) So we can decide whether to call ->branch or ->leaf depending on whether that list is empty or not, like

(if-let [kid-ids (descendants node)]
   (->branch node ...)
   (->leaf node))

Ok, then we need to supply these ... to ->branch. Are they kid-ids? No, they must be trees, not ids, either branches or leaves themselves. In other words, we need to first call ->tree on them. See? We hit the recursion point, and it came about naturally (or so I hope). In Clojure, calling a function on every element of a sequence is done by map, which returns a sequence of results, which is exactly what we need:

(if-let [kid-ids (descendants adj-list node)]
   (->branch node (map ->tree kid-ids)
   (->leaf node))

except that ->tree expects an additional parameter, adj-list. We can use an anonymous function, or we can use partial. We will use a partial with a let binding, which is the cleanest way of doing it:

(let [->tree' (partial ->tree adj-list)]
   (if-let [kid-ids (descendants adj-list node)]
      (->branch node (map ->tree' kid-ids))
      (->leaf node)))

This is it. Let's put it together:

(defn ->tree
  [adj-list node]
  (let [->tree' (partial ->tree adj-list)]
     (if-let [kid-ids (descendants adj-list node)]
       (->branch node (map ->tree' kid-ids))
       (->leaf node))))

The result:

(def adj-list [[:A :B] [:A :C] [:C :D] [:D :E] [:A :F] [:F :G]])

(->tree adj-list :A)
;; => (:A (:B) (:C (:D (:E))) (:F (:G)))

Let's sum up:

  • Look at your data in abstract terms. Create constructors, and use them for building your output.
  • To build output means to create a chain of constructor calls.
  • Which means you need to map your input's shape to output constructors and their parameters.
  • In many cases, the recursion will emerge by itself naturally at this step.

After you have done it, you can optimize and short-circuit to your heart's content (like if you look closely, you can do away with ->leaf.) But don't do it prematurely.

like image 169
Yuri Steinschreiber Avatar answered Feb 10 '23 04:02

Yuri Steinschreiber


Your output looks like an associative structure, is there a reason you're using a list instead of a map here?

Also is the ordering guaranteed to match the tree structure as in your example? I'll assume not. And I think it's easier to not use recursion here...

So let's parse the input into an adjacency list, like:

(reduce 
(fn[g edge] 
 (let [[from to]] (map keyword (str/split " " edge))
  (update g from #(conj % to))
 {} 
["A B" "B C"])

Which should output: {A [B C F],,,}

which can be used to create your tree as a list if you desire.

I can't test this because I'm in mobile at the moment so forgive any mistakes. :)

like image 23
user2946753 Avatar answered Feb 10 '23 03:02

user2946753