Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does clojure's group-by not always maintain order?

Why does (group-by identity (range 1 50)) return results like

{32 [32], 1 [1], 33 [33], 2 [2], 34 [34], 3 [3], 35 [35]...

Is it multi-threading related? Is there any way around it?

...and does it break it's contract?

Returns a map of the elements of coll keyed by the result of f on each element. The value at each key will be a vector of the corresponding elements, in the order they appeared in coll.

like image 518
status203 Avatar asked Feb 06 '12 13:02

status203


1 Answers

Try entering (type (group-by identity (range 1 50))) in your REPL. You can see that the result is actually a hash map (of class clojure.lang.PersistentHashMap). This means that it's unordered: in principle the REPL could output the printed map literal in any order of key/value pairs.

The actual reason why it's printed the way it's printed has to do with Clojure's implementation of a hash map -- in terms of data structures, it's actually a wide tree where each node can have up to 32 children (hence the initial 32 in your output; recall that Clojure vectors and maps are often cited to have lookup cost of O(log32N)). This blog article has a good summary.

And no, it doesn't violate the contract of group-by. The contract only specifies the ordering of the map's values' elements, not the ordering of the map per se.

like image 151
Daniel Janus Avatar answered Sep 28 '22 09:09

Daniel Janus