 

Why is there no significant speedup using reducers in this example?

(require '[clojure.core.reducers :as r])

(def data (into [] (take 10000000 (repeatedly #(rand-int 1000)))))

(defn frequencies [coll]
  (reduce (fn [counts x]
            (merge-with + counts {x 1}))
          {} coll))

(defn pfrequencies [coll]
  (r/reduce (fn [counts x]
              (merge-with + counts {x 1}))
            {} coll))


user=> (time (do (frequencies data) nil))
"Elapsed time: 29697.183 msecs"

user=> (time (do (pfrequencies data) nil))
"Elapsed time: 25273.794 msecs"

user=> (time (do (frequencies data) nil))
"Elapsed time: 25384.086 msecs"

user=> (time (do (pfrequencies data) nil))
"Elapsed time: 25778.502 msecs"

And can someone show me an example with a significant speedup?

I'm running on Mac OS X 10.7.5 with Java 1.7 on an Intel Core i7 (2 cores, http://ark.intel.com/products/54617).

Asked May 20 '13 by Michiel Borkent


2 Answers

Some serious food for thought in the answers here. In this specific case maps should not be needed, since the result domain is known in advance (integers 0-999) and the counts can be kept in a vector indexed by value. So a naive implementation of a naive problem would be something like:

(defn freqs
  [coll]
  (reduce (fn [counts x] (assoc counts x (inc (get counts x))))
          (vec (int-array 1000 0))
          coll))

(defn rfreqs
  [coll]
  (r/fold
    (fn combinef
      ([] (vec (int-array 1000 0)))
      ([& cols] (apply mapv + cols)))
    (fn reducef
      [counts x] (assoc counts x (inc (get counts x))))
    coll))

Here the combinef is a simple element-wise addition (mapv +) over the 1000 slots of the partial result vectors, which should be a negligible cost.
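
For illustration (a hypothetical REPL snippet, not from the original answer), the combine step is nothing more than summing two partial count vectors slot by slot:

user=> (mapv + [3 0 2] [1 4 2])
[4 4 4]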

This gives the reducer version a speedup of about 2-3x over the normal version, especially on bigger (10x-100x) datasets. Some twiddling with the partition size of r/fold (the optional 'n' parameter) can be done as fine-tuning. (* 16 1024) seemed optimal with a data size of 1e8 (which needs at least a 6 GB JVM).
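
For reference, the partition size is passed as the first argument to r/fold. A sketch of rfreqs with an explicit group size (rfreqs-tuned is my name, and 16384 is just the setting mentioned above, not a universal optimum):

(defn rfreqs-tuned
  [coll]
  ;; r/fold accepts the partition size n as an optional first argument;
  ;; ~16k elements per partition worked well here, but this is machine-dependent
  (r/fold (* 16 1024)
          (fn combinef
            ([] (vec (int-array 1000 0)))
            ([& cols] (apply mapv + cols)))
          (fn reducef
            [counts x] (assoc counts x (inc (get counts x))))
          coll))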

You could even use transients in both versions, but I didn't notice much improvement.
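
For completeness, a rough sketch of what the sequential version with a transient vector could look like (freqs-transient is my naming, not part of the answer):

(defn freqs-transient
  [coll]
  ;; same idea as freqs, but mutates a transient vector and freezes it at the end
  (persistent!
    (reduce (fn [counts x] (assoc! counts x (inc (get counts x))))
            (transient (vec (int-array 1000 0)))
            coll)))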

I know this version isn't appropriate for generic usage, but it might show the speed improvement without the hash management overhead.

Answered by NielsK


A fold version of your frequencies function would look something like

(defn pfrequencies [coll] 
  (r/fold 
    (fn combinef
      ([] {})
      ([x y] (merge-with + x y)))
    (fn reducef
      ([counts x] (merge-with + counts {x 1})))
    coll))

On 2 cores, it will likely be much slower than clojure.core/frequencies, which uses transients. On 4 cores, at least, it is about 2x faster than the first implementation, but still slower than clojure.core/frequencies.
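
For comparison, a single-threaded implementation in the spirit of clojure.core/frequencies (a sketch of the transient-map approach, not the literal core source) looks roughly like this:

(defn tfrequencies [coll]
  ;; mutate a transient map while reducing, then freeze it;
  ;; this avoids allocating a new persistent map on every step
  (persistent!
    (reduce (fn [counts x] (assoc! counts x (inc (get counts x 0))))
            (transient {})
            coll)))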

You might also experiment with

(defn p2frequencies [coll]
  ;; count each 512-element chunk in parallel with pmap,
  ;; then merge the per-chunk frequency maps into one result
  (apply merge-with + (pmap clojure.core/frequencies (partition-all 512 coll))))
Answered by A. Webb