Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Heap's algorithm in Clojure (can it be implemented efficiently?)

Heap's algorithm enumerates the permutations of an array. Wikipedia's article on the algorithm says that Robert Sedgewick concluded the algorithm was ``at that time the most effective algorithm for generating permutations by computer,'' so naturally it would be fun to try to implement.

The algorithm is all about making a succession of swaps within a mutable array, so I was looking at implementing this in Clojure, whose sequences are immutable. I put the following together, avoiding mutability completely:

(defn swap [a i j]
  (assoc a j (a i) i (a j)))

(defn generate-permutations [v n]
  (if (zero? n)
    ();(println (apply str a));Comment out to time just the code, not the print
    (loop [i 0 a v]
      (if (<= i n)
        (do
          (generate-permutations a (dec n))
          (recur (inc i) (swap a (if (even? n) i 0) n)))))))

(if (not= (count *command-line-args*) 1)
  (do (println "Exactly one argument is required") (System/exit 1))
  (let [word (-> *command-line-args* first vec)]
    (time (generate-permutations word (dec (count word))))))

For an 11-character input string, the algorithm runs (on my computer) in 7.3 seconds (averaged over 10 runs).

The equivalent Java program, using character arrays, runs in 0.24 seconds.

So I would like to make the Clojure code faster. I used a Java array with type hinting. This is what I tried:

(defn generate-permutations [^chars a n]
  (if (zero? n)
    ();(println (apply str a))
    (doseq [i (range 0 (inc n))]
      (generate-permutations a (dec n))
      (let [j (if (even? n) i 0) oldn (aget a n) oldj (aget a j)]
        (aset-char a n oldj) (aset-char a j oldn)))))

(if (not= (count *command-line-args*) 1)
  (do
    (println "Exactly one argument is required")
    (System/exit 1))
  (let [word (-> *command-line-args* first vec char-array)]
    (time (generate-permutations word (dec (count word))))))

Well, it's slower. Now it averages 9.1 seconds for the 11-character array (even with the type hint).

I understand mutable arrays are not the Clojure way, but is there any way to approach the performance of Java for this algorithm?

like image 210
Ray Toal Avatar asked Nov 01 '15 18:11

Ray Toal


People also ask

What are the applications of binary heap?

Some of the most important applications of the binary heap are: Used in operating systems for scheduling jobs on a priority basis. Used in graph algorithms like Dijkstra’s Algorithm (to find the shortest path), Minimum Spanning Tree, and Prim's Algorithm. Used to perform heap sort.

What is a heap in Java?

A heap is a complete binary tree, in which all levels except the last, must be completely filled, and left-justified, which means the left subtree is filled before the right subtree. The complete binary tree is always balanced in order to guarantee logarithmic performance. The most common misconception about heaps is that they are sorted.

What is the advantage of using heap data structure?

Used to perform heap sort. Because heaps are relatively flexible data structures, memory allocation is usually very quick. In comparison to other data structures, element insertion and deletion are also quite fast. The smallest and largest element can also be found very quickly as it is generally stored at the root node.

What is a min heap?

In a min heap, the root node is the smallest of all the other nodes in the tree, and the value of every parent node is always smaller than or equal to the value of the child nodes.


1 Answers

It's not so much that Clojure is entirely about avoiding mutable state. It's that Clojure has very strong opinions on when it should be used.

In this case, I'd highly recommend finding a way to rewrite your algorithm using transients, as they're specifically designed to save time by avoiding the reallocation of memory and allowing a collection to mutable locally so long as the reference to the collection never leaves the function in which it was created. I recently managed to cut a heavily memory intensive operation's time by nearly 10x by using them.

This article explains transients fairly well!

http://hypirion.com/musings/understanding-clojure-transients

Also, you may want to look into rewriting your loop structure in a way that allows you to use recur to recursively call generatePermutations rather than using the whole name. You'll likely get a performance boost, and it'd tax the stack a lot less.

I hope this helps.

like image 195
sleepysilverdoor Avatar answered Sep 29 '22 17:09

sleepysilverdoor