I need to make 200 or so HTTP requests. I want them to run in parallel, or batches, and I'm not sure where to start for doing this in Clojure. pmap appears to have the effect I want, for example, using http.async.client:
(require '[http.async.client :as http])
(defn get-json [url]
  (with-open [client (http/create-client)]
    (let [resp (http/GET client url)]
      (try
        (println 1)
        (http/string (http/await resp)) ; body is read, then discarded
        (println "********DONE*********")
        nil
        (catch Exception e (println e) {})))))
music.core=> (pmap get-json [url url2])
1
1
********DONE*********
********DONE*********
(nil nil)
But I can't prove that the requests are actually executing in parallel. Do I need to call into the JVM's Thread APIs? I'm searching around and coming up with other libraries like Netty, Lamina, Aleph - should I be using one of these? Please just point me in the right direction for learning about the best practice/simplest solution.
Ideally you don't want to tie up a thread waiting for the result of each HTTP request, so pmap or other thread-based approaches aren't really a good idea.
What you really want to do is fire off all the requests asynchronously and then wait for the results with just one thread.
My suggested approach is to use http-kit to fire off all the asynchronous requests at once, producing a sequence of promises. You then just need to dereference all these promises in a single thread, which will block the thread until all results are returned.
Something like:
(require '[org.httpkit.client :as http])
(let [urls (repeat 100 "http://google.com") ;; insert your URLs here
      promises (doall (map http/get urls))
      results (doall (map deref promises))]
  #_do_stuff_with_results
  (first results))
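Since the question also mentions batches, here is a minimal sketch of the same fire-then-deref pattern applied one batch at a time. The names fetch-in-batches and fake-fetch are made up for illustration, and fake-fetch is only a stand-in so the sketch runs without a network; in real use you would pass something like #(http/get %) from http-kit, which likewise returns a value you can deref.

```clojure
;; Hypothetical helper: start every request in a batch, then wait for the
;; whole batch before starting the next one. `fetch` must return something
;; derefable (a promise or a future).
(defn fetch-in-batches [fetch batch-size urls]
  (into []
        (mapcat (fn [batch]
                  (let [pending (mapv fetch batch)] ; fire off the whole batch
                    (mapv deref pending))))         ; block until it is done
        (partition-all batch-size urls)))

;; Stand-in for a real request (an assumption, not an http-kit function).
(defn fake-fetch [url]
  (future
    (Thread/sleep 50)
    {:status 200 :url url}))

(def results
  (fetch-in-batches fake-fetch
                    20
                    (map #(str "http://example.com/" %) (range 50))))

(println (count results) (:url (first results)))
```

Results come back in the same order as the input URLs, because each batch is dereferenced in order. The batch size of 20 is arbitrary; pick whatever the remote server tolerates.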
What you're describing is a perfectly good use of pmap and I'd approach it in a similar fashion.
As far as 'proving' that it runs in parallel, pmap wraps each call in a future, so the calls run on threads from Clojure's thread pool rather than on the calling thread. A simple way to be certain is to print the thread id as a sanity check:
user=> (defn thread-id [_] (.getId (Thread/currentThread)))
user=> (pmap thread-id [1 2 3])
(53 11 56)
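As the thread ids are in fact different, each call ran on its own thread, so you can safely trust that the JVM will run your code in parallel. Keep in mind that these threads come from a pool and may be reused, and that pmap is semi-lazy: it only keeps a limited number of calls running ahead of what has been consumed.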
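Thread ids show where the calls ran; timing shows that they overlapped. The following is a rough sketch of that check, with a hypothetical slow-call function that just sleeps in place of a real HTTP request and a small elapsed-ms helper that is also made up for this example.

```clojure
;; Stand-in for a slow HTTP call: sleeps, then reports which thread ran it.
(defn slow-call [n]
  (Thread/sleep 200)
  {:n n :thread (.getName (Thread/currentThread))})

;; Runs the no-argument function f and returns the elapsed time in ms.
(defn elapsed-ms [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

;; doall forces the lazy sequences so the work happens inside the timer.
(def serial-ms   (elapsed-ms #(doall (map  slow-call (range 4)))))
(def parallel-ms (elapsed-ms #(doall (pmap slow-call (range 4)))))

(println "map:" serial-ms "ms, pmap:" parallel-ms "ms")
```

If the calls really overlap, the pmap time should be close to one sleep rather than the sum of all four.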
Also have a look at other parallel functions such as pvalues and pcalls. They give you different semantics and might be the right answer depending on the problem at hand.
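For a quick feel of how the two differ, here is a small sketch; slow-inc is a made-up function used only to have something slow to call. Both functions are built on pmap, so they share its threading behaviour.

```clojure
;; Made-up slow function, standing in for real work.
(defn slow-inc [n]
  (Thread/sleep 100)
  (inc n))

;; pcalls takes no-argument functions and calls them in parallel.
(def from-pcalls (doall (pcalls #(slow-inc 1) #(slow-inc 2))))

;; pvalues takes expressions and evaluates them in parallel.
(def from-pvalues (doall (pvalues (slow-inc 10) (slow-inc 20))))

(println from-pcalls from-pvalues) ; prints (2 3) (11 21)
```

pcalls fits when you already have a collection of functions to run; pvalues reads better when you have a handful of different expressions.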