I need to write some stuff to a file based on elements of a Clojure collection, which I can do--but I ran into something that confuses me. It's probably because I don't fully understand the time macro, but when I do the following:
=> (def nums (take 100000 (repeatedly #(str (rand-int 1000) " "))))
(defn out1 [nums] (doseq [n nums] (spit "blah1.txt" n :append true)))
(defn out2 [nums] (map #(spit "blah2.txt" % :append true) nums))
#'test.core/nums
#'test.core/out1
#'test.core/out2
=> (time (out1 nums))
"Elapsed time: 19133.247 msecs"
nil
=> (time (out2 nums))
"Elapsed time: 0.209 msecs"
(nil nil nil nil ... )
The implementation using map (out2) appears to run significantly faster. However, when I go to the folder and watch the file, it keeps being written to after the elapsed time is printed, and the (nil ...) output only shows up once the writing is done. That leads me to believe that both actually take about the same time.
So, what's the difference between using doseq and map in this situation? And which way would be better overall? Thanks
doseq is eager (not lazy) and does all the work when you call it. map is lazy and immediately returns a lazy sequence representing the work that will happen when the result is read.
So map does its work when the REPL prints the result (all the nils), not in the part you are timing. To fix this, wrap the call to map in doall or dorun:
(time (doall (out2 nums)))
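To see the difference without touching the file system, here is a small sketch. The calls atom and the bump! function are hypothetical stand-ins for the spit calls in the question; they only count how often the side effect has actually run.

```clojure
;; Hypothetical counter standing in for the file writes.
(def calls (atom 0))

(defn bump!
  "Side-effecting function: increments the counter and returns nil."
  [_]
  (swap! calls inc)
  nil)

;; doseq is eager: the body has run for every element when it returns.
(doseq [n (range 5)] (bump! n))
(def after-doseq @calls)   ; all 5 side effects have happened

(reset! calls 0)

;; map is lazy: this only builds a lazy sequence, nothing has run yet.
(def pending (map bump! (range 5)))
(def before-force @calls)  ; still 0

;; dorun walks the sequence for its side effects and discards the results.
(dorun pending)
(def after-force @calls)   ; now 5
```

dorun is enough here because the nils returned by the mapped function are never used; doall would additionally keep the whole realized sequence in memory.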
The more significant bug is that if you don't print the result (or otherwise consume it), the contents will not be written to the file at all. In general, for purely side-effecting operations, doseq is likely the better choice.
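As a rough sketch of the doseq approach, the variant below keeps the same loop as out1 but opens the file once instead of calling spit per element. This is an assumption about what might help, not something from the question: the name out3, the path argument, and the temp file are all made up for illustration.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical variant of out1: same eager doseq loop, but a single
;; writer is opened up front and closed by with-open when the loop ends.
(defn out3 [path nums]
  (with-open [^java.io.Writer w (io/writer path :append true)]
    (doseq [n nums]
      (.write w (str n)))))

;; Usage with a throwaway temp file (an assumption, so nothing real is overwritten).
(def demo-path (str (java.io.File/createTempFile "blah3" ".txt")))
(out3 demo-path ["1 " "2 " "3 "])
(def demo-contents (slurp demo-path))
```

Because doseq is eager, the writes have completed by the time out3 returns, whether or not anything consumes its nil result.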