I came across this while tuning some performance-sensitive code:
user> (use 'criterium.core)
nil
user> (def n (into {} (for [i (range 20000) :let [k (keyword (str i))]] [k {k k}])))
#'user/n
user> (quick-bench (-> n :1 :1))
WARNING: Final GC required 32.5115186521176 % of runtime
Evaluation count : 15509754 in 6 samples of 2584959 calls.
Execution time mean : 36.256135 ns
Execution time std-deviation : 1.076403 ns
Execution time lower quantile : 35.120871 ns ( 2.5%)
Execution time upper quantile : 37.470993 ns (97.5%)
Overhead used : 1.755171 ns
nil
user> (quick-bench (get-in n [:1 :1]))
WARNING: Final GC required 33.11057826481865 % of runtime
Evaluation count : 7681728 in 6 samples of 1280288 calls.
Execution time mean : 81.023429 ns
Execution time std-deviation : 3.244516 ns
Execution time lower quantile : 78.220643 ns ( 2.5%)
Execution time upper quantile : 85.906898 ns (97.5%)
Overhead used : 1.755171 ns
nil
It's unintuitive to me that get-in is more than twice as slow as threading through gets here, as get-in seems to be defined as the better abstraction for this sort of thing.
Does anyone have any insight into why this is the case (both technically and philosophically)?
Nested maps are very commonly used in Clojure programs, and that is a good thing. But there are occasions where nested map operations such as assoc-in and get-in can be improved by unrolling. (get (get (get (get m :d) :c) :b) :a) is not the same thing as (get-in m [:d :c :b :a]) in terms of the byte code produced, even though both return the same value. The byte code of the latter results in worse execution time.
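To make the difference concrete, here is a minimal sketch. get-in-sketch mirrors the two-argument arity of get-in, which folds get over the key sequence at runtime. unrolled-get-in is a hypothetical macro (not part of clojure.core) that expands a literal key vector into nested get calls at compile time. Both names and the sample map m are assumptions made for illustration.

```clojure
;; Sketch of the two-argument arity of get-in: fold `get` over the keys
;; at runtime. (clojure.core uses an internal reduce1; plain reduce
;; behaves the same for this purpose.)
(defn get-in-sketch
  [m ks]
  (reduce get m ks))

;; Hypothetical macro: unroll a literal key vector into nested `get`
;; calls at compile time, so no key sequence is walked at runtime.
(defmacro unrolled-get-in
  [m ks]
  (when-not (vector? ks)
    (throw (IllegalArgumentException. "ks must be a literal vector")))
  (reduce (fn [form k] `(get ~form ~k)) m ks))

;; Sample data, assumed for illustration.
(def m {:d {:c {:b {:a 42}}}})

(get-in-sketch m [:d :c :b :a])
;; => 42

(unrolled-get-in m [:d :c :b :a])
;; => 42

(macroexpand-1 '(unrolled-get-in m [:d :c]))
;; => (clojure.core/get (clojure.core/get m :d) :c)
```

Whether the unrolled form is actually faster in a given program is worth confirming with criterium, as in the question, since the gap depends on the JVM and on how deep the lookup is.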
Note that Clojure has some pending patches http://dev.clojure.org/jira/browse/CLJ-1656 related to this.