(doseq [e coll1]
  (myfunc e))
is very fast, if all you care about are side effects. What if I want myfunc
to take elements from multiple collections "in parallel", i.e. apply myfunc
to the first elements of each collection, then to all of the second elements, then to all of the third elements, etc.? Note that this is as much a question about for as about doseq, but if one wants a sequence as output, map will do what's needed, so for isn't necessary.
(doseq [e1 coll1
        e2 coll2]
  (myfunc e1 e2))
will instead apply myfunc to all possible combinations of elements from the two collections. If I know in advance what the elements of the collections will be, I could use a :when test to combine only certain elements, but suppose that I don't know that?
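To make the difference concrete, here is a small sketch with made-up data. The names calls and record! are hypothetical and only there to capture which argument combinations get visited; the positional pairing is shown with map, as mentioned above.

```clojure
;; Hypothetical recorder: remembers every argument list it is called with.
(def calls (atom []))
(defn record! [& args] (swap! calls conj (vec args)))

;; Two bindings in one doseq nest like loops, so every e1 meets every e2.
(doseq [e1 [1 2 3]
        e2 [:a :b :c]]
  (record! e1 e2))
(def cartesian @calls) ; 9 pairs: [1 :a] [1 :b] [1 :c] [2 :a] ...

;; What is wanted instead: elements paired by position.
(def lockstep (vec (map vector [1 2 3] [:a :b :c]))) ; [[1 :a] [2 :b] [3 :c]]
```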
One solution is to create n-tuples to avoid the Cartesian product, but that is time-consuming, removing the speed advantage of using doseq in the first place:
(let [argvecs (map vector coll1 coll2)] ; seq of n-tuples of interleaved vals
  (doseq [args argvecs]
    (apply myfunc args)))
(This can be about 8X slower than a single-collection doseq. See times for domap1 and domap17 at the end of this question.)
If you want to avoid the overhead of creating tuples with map, all you can do is write it yourself, as a loop/recur that walks each collection manually. But really, you'll still end up needing to create a tuple so that you can (apply f args), where args is the nth item of each collection. You'll save a few cons cells by not making a list of such tuples, but that's all. A lot of the expense of variadic functions like this is calling apply, and building the lists to do that with. You can avoid that by writing a 2-arity version of your doseq-sibling, and a 3-arity, and... But the n-arity version will always be slower.
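A fixed-arity version along those lines might look like the sketch below. The name doseq2 is hypothetical, and stopping at the end of the shorter collection is an assumption chosen to match how map behaves.

```clojure
;; Hypothetical helper: call f on positionally paired elements of two
;; collections, for side effects only. No tuples are built and apply is
;; never called. Stops when either collection runs out.
(defn doseq2 [f coll1 coll2]
  (loop [s1 (seq coll1)
         s2 (seq coll2)]
    (when (and s1 s2)
      (f (first s1) (first s2))
      (recur (next s1) (next s2)))))

;; Usage: record which pairs were visited.
(def seen (atom []))
(doseq2 (fn [x y] (swap! seen conj [x y])) [1 2 3] [:a :b])
;; @seen is now [[1 :a] [2 :b]]
```

A 3-arity sibling would need its own loop with a third binding, which is the repetition the answer is warning about.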
If it's speed you're after, you should turn on reflection warnings and maybe check out the loop primitive (recurring with (rest coll1) (rest coll2))...
Also check out "Clojure is still fast" and the performance testing framework Criterium to make sure you are measuring the right thing.
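As a rough illustration of the reflection-warning advice, the sketch below can be pasted at a REPL or placed at the top of a source file. The two len functions are hypothetical examples, not part of the question's code.

```clojure
;; Ask the compiler to report interop calls it cannot resolve at compile time.
(set! *warn-on-reflection* true)

;; Without a type hint, .length is looked up reflectively at run time,
;; and compiling this form prints a reflection warning.
(defn len-reflective [s] (.length s))

;; With a type hint the compiler emits a direct method call and stays quiet.
(defn len-hinted [^String s] (.length s))

(len-reflective "abc") ; 3
(len-hinted "abc")     ; 3
```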
Use (dorun (map f coll1 coll2 ..)) or (dorun (apply map f colls)).
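Here is a minimal sketch of both forms. The names log, note!, and colls are made up for the example; when the collections arrive as a sequence, apply is what spreads them out as separate arguments to map.

```clojure
;; Hypothetical side-effecting function: records the sum of its arguments.
(def log (atom []))
(defn note! [& args] (swap! log conj (apply + args)))

;; Number of collections known when writing the code.
(dorun (map note! [1 2 3] [10 20 30]))
(def fixed-result @log) ; [11 22 33]

;; Collections only available as a sequence at run time.
(def colls [[1 2 3] [10 20 30] [100 200 300]])
(reset! log [])
(dorun (apply map note! colls))
(def variadic-result @log) ; [111 222 333]
```

dorun walks the lazy sequence without holding on to its head, so the results of f are discarded as they are produced.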
The more you ask of f, the longer it is going to take.
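(require '[criterium.core :refer [bench]]) ; bench is assumed to come from Criterium
(def a (atom 0)) ; running total, updated as the side effect
(defn f [& args] (swap! a #(apply + % args))) ; add every argument to the total
(def N 10000)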
On a single collection, use doseq. The map version carries avoidable overhead from the lazy-seq structure.
(bench (doseq [e (range N)] (f e)))
Execution time mean : 4.959713 ms
(bench (dorun (map f (range N))))
Execution time mean : 5.669721 ms
On two collections, note that f has to add twice instead of once, so I would expect this to take twice as long. Note that both versions now have some structural overhead.
(bench (let [argvecs (map vector (range N) (range N))]
         (doseq [e argvecs] (apply f e))))
Execution time mean : 11.876843 ms
(bench (dorun (map f (range N) (range N))))
Execution time mean : 11.145435 ms
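To check the "twice as long" expectation against these numbers, the ratios can be worked out directly. This is only arithmetic on the means reported above; the var names are made up for the example.

```clojure
;; Mean execution times in ms, copied from the benchmark output above.
(def doseq-one-coll 4.959713)
(def doseq-two-colls 11.876843)
(def map-one-coll 5.669721)
(def map-two-colls 11.145435)

;; Slowdown when going from one collection to two.
(def doseq-ratio (/ doseq-two-colls doseq-one-coll)) ; roughly 2.4
(def map-ratio (/ map-two-colls map-one-coll))       ; roughly 1.97
```

So the map version lands close to the expected factor of two, while the doseq version, which also builds the tuples and calls apply, comes out somewhat above it.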