
Why are Clojure's core.reducers faster than lazy collection functions

In many resources on Reducers (like the canonical blog post by Rich Hickey), it's claimed that reducers are faster than the regular collection functions ((map ... (filter ...)) etc.) because there is less overhead.

What is the extra overhead that's avoided? IIUC even the lazy collection functions end up walking the original sequence just once. Is the difference in the details of how intermediate results are computed?

Pointers to relevant places in Clojure's implementation that help explain the difference would be most helpful too.
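
For concreteness, this is the kind of pipeline I have in mind (just a toy sketch, not code from a real program):

(require '[clojure.core.reducers :as r])

;; seq-based: map and filter each produce an intermediate lazy seq
(reduce + (filter even? (map inc [1 1 1 2])))
;=> 6

;; reducer-based: same functions, same result
(reduce + (r/filter even? (r/map inc [1 1 1 2])))
;=> 6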

asked Mar 15 '16 by zlatanski


1 Answer

I think one key insight is in the following passage from the original blog post:

(require '[clojure.core.reducers :as r])
(reduce + (r/filter even? (r/map inc [1 1 1 2])))
;=> 6

That should look familiar - it's the same named functions, applied in the same order, with the same arguments, producing the same result as the Clojure's seq-based fns. The difference is that, reduce being eager, and these reducers fns being out of the seq game, there's no per-step allocation overhead, so it's faster. Laziness is great when you need it, but when you don't you shouldn't have to pay for it.

The realisation of a lazy sequence comes with a (linear) allocation cost: every time another element from the lazy seq is realised, the rest of the seq is stored in a new thunk, and the representation of such a ‘thunk’ is a new clojure.lang.LazySeq object.
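
To make that concrete, here is roughly what a seq-based map looks like (a simplified sketch, not the actual clojure.core/map source):

;; Simplified sketch of a seq-based map: each step wraps the rest of
;; the work in lazy-seq, and every lazy-seq call allocates a new
;; clojure.lang.LazySeq thunk.
(defn my-map [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s)) (my-map f (rest s))))))

(class (my-map inc [1 2 3]))
;=> clojure.lang.LazySeq

Realising the result of (my-map inc coll) therefore allocates one LazySeq per element, and chaining a lazy filter on top of it does the same again.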

I believe those LazySeq objects are the allocation overhead referred to in the quote. With reducers there is no gradual realisation of lazy seq elements and therefore no instantiation of LazySeq thunks at all.
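
A reducer, by contrast, is essentially a recipe: r/map doesn't walk the collection at all, it just wraps it with a transformed reducing function, and the whole pipeline runs inside one eager reduce. A rough sketch of the idea (the real clojure.core.reducers source, clojure/core/reducers.clj, builds on the CollReduce and CollFold protocols):

;; Rough sketch of the reducer idea: mapping/filtering transform the
;; reducing function instead of building intermediate sequences.
(defn mapping [f]
  (fn [rf]
    (fn [acc x] (rf acc (f x)))))

(defn filtering [pred]
  (fn [rf]
    (fn [acc x] (if (pred x) (rf acc x) acc))))

;; one eager reduce, no per-element LazySeq allocation
(reduce ((mapping inc) ((filtering even?) +)) 0 [1 1 1 2])
;=> 6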

answered Jun 02 '23 by glts