I'm fetching thousands of entities from an API one at a time using HTTP requests. As the next step in the pipeline I want to shovel all of them into a database.
(->> ids
     (pmap fetch-entity)
     (pmap store-entity)
     (doall))
fetch-entity expects a String id and tries to retrieve an entity using an HTTP request. It either returns a Map or throws an exception (e.g. because of a timeout).
store-entity expects a Map and tries to store it in a database. It may throw an exception (e.g. if the Map doesn't match the database schema, or if it didn't receive a Map at all).
My first "solution" was to write wrapper functions fetch-entity' and store-entity' to catch exceptions of their respective original functions.
fetch-entity' returns its input on failure, basically passing along a String id if the HTTP request failed. This ensures that the whole pipeline keeps on trucking.
store-entity' checks the type of its argument. If the argument is a Map (fetching the entity was successful and returned a Map), it attempts to store it in the database.
If the attempt to store to the database throws an exception, or if store-entity' got passed a String (id) instead of a Map, it will conj the id onto an external Vector called error_ids.
This way I can later use error_ids to figure out how often there was a failure and which ids were affected.
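For reference, the wrapper approach described above might look roughly like the following sketch. The bodies of fetch-entity and store-entity are hypothetical stand-ins for the real HTTP and database calls, the :id key on the entity is an assumption, and error_ids is held in an atom (also an assumption) so that the pmap worker threads can update it safely.

```clojure
;; Hypothetical stand-ins for the real I/O functions.
(defn fetch-entity [id]
  (if (= id "bad-fetch")
    (throw (ex-info "timeout" {:id id}))
    {:id id}))

(defn store-entity [entity]
  (if (= (:id entity) "bad-store")
    (throw (ex-info "schema mismatch" {:id (:id entity)}))
    entity))

;; Shared collection of failed ids; an atom so pmap threads can update it safely.
(def error_ids (atom []))

(defn fetch-entity'
  "Returns the fetched Map, or the original String id if fetching throws."
  [id]
  (try
    (fetch-entity id)
    (catch Exception _ id)))

(defn store-entity'
  "Stores a Map. Records the id in error_ids (and returns nil) when it is
   given a String id or when storing throws."
  [x]
  (if (map? x)
    (try
      (store-entity x)
      (catch Exception _
        (swap! error_ids conj (:id x))
        nil))
    (do (swap! error_ids conj x)
        nil)))

(def stored
  (->> ["a" "bad-fetch" "bad-store" "b"]
       (pmap fetch-entity')
       (pmap store-entity')
       (doall)))
```

After this runs, error_ids holds "bad-fetch" and "bad-store" (in whatever order the threads finished), which makes the coupling visible: store-entity' has to know both about the String fallback of the previous step and about the shared atom.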
It doesn't feel like the above is a sensible way to achieve what I'm trying to do. For example, the way I wrote store-entity' complects the function with the previous pipeline step (fetch-entity'), because it behaves differently based on whether the previous pipeline step was successful or not.
Also, having store-entity' be aware of an external Vector called error_ids does not feel right at all.
Is there an idiomatic way to handle these kinds of situations, where a pipeline has multiple steps, some of which can throw exceptions (e.g. because they do I/O), where I can't easily use predicates to make sure a function will behave predictably, and where I don't want to disturb the pipeline but only check afterwards in which cases it went wrong?
It is possible to use a type of Try monad, for example from the cats library:
It represents a computation that may either result in an exception or return a successfully computed value. Is very similar to the Either monad, but is semantically different.
It consists of two types: Success and Failure. The Success type is a simple wrapper, like Right of the Either monad. But the Failure type is slightly different from Left, because it always wraps an instance of Throwable (or any value in cljs since you can throw arbitrary values in the JavaScript host).
(...)
It is an analogue of the try-catch block: it replaces try-catch’s stack-based error handling with heap-based error handling. Instead of having an exception thrown and having to deal with it immediately in the same thread, it disconnects the error handling and recovery.
Heap-based error-handling is what you want.
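To make the idea concrete without the library, here is a minimal plain-Clojure sketch of heap-based error handling. The helpers attempt and then are hypothetical names, not part of cats; a caught exception is simply returned as an ordinary value and carried through the rest of the pipeline.

```clojure
(defn attempt
  "Calls (f x). Returns the result, or the caught Exception as a plain value."
  [f x]
  (try
    (f x)
    (catch Exception e e)))

(defn then
  "Applies f to x unless x is already a captured exception,
   in which case x is passed along unchanged."
  [f x]
  (if (instance? Throwable x)
    x
    (attempt f x)))

(def outcomes
  (->> [4 0 "two"]
       (map #(attempt (fn [n] (/ 8 n)) %)) ; step 1 may throw
       (map #(then inc %))                 ; step 2 skips earlier failures
       (doall)))

;; Inspect the failures later, away from where they were thrown.
(def failures (filter #(instance? Throwable %) outcomes))
```

Here the first element makes it through both steps, while the other two end up as exception objects sitting in the result sequence. The Try type in cats plays the same role, but with explicit Success and Failure wrappers instead of an instance? check.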
Below I made an example of fetch-entity and store-entity. I made fetch-entity throw an ExceptionInfo on the first id (1), and store-entity throws an ArithmeticException ("Divide by zero") on the second id (0).
(ns your-project.core
  (:require [cats.core :as cats]
            [cats.monad.exception :as exc]))
(def ids [1 0 2]) ;; `fetch-entity` throws on 1, `store-entity` on 0, 2 works
(defn fetch-entity
  "Throws an exception when the id is 1..."
  [id]
  (if (= id 1)
    (throw (ex-info "id is 1, help!" {:id id}))
    id))
(defn store-entity
  "Unfortunately this function still needs to be aware that it receives a Try.
  It throws an `ArithmeticException` (divide by zero) when the id is 0"
  [id-try]
  (if (exc/success? id-try)                    ; was the previous step a success?
    (exc/try-on (/ 1 (exc/extract id-try)))    ; if so: extract, apply fn, and rewrap
    id-try))                                   ; else return original for later processing
(def results
  (->> ids
       (pmap #(exc/try-on (fetch-entity %)))
       (pmap store-entity)))
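If having store-entity inspect the Try itself feels like the same coupling as before, the check can be factored out into a helper. The sketch below uses only the cats functions already shown (exc/success?, exc/extract, exc/try-on); lift-try and store-entity-plain are hypothetical names introduced for illustration, and the namespace name is made up.

```clojure
(ns your-project.lift
  (:require [cats.monad.exception :as exc]))

(defn lift-try
  "Turns a plain function f into one that accepts and returns a Try.
   A Failure is passed through untouched; a Success is unwrapped,
   f is applied, and the result (or any thrown exception) is rewrapped."
  [f]
  (fn [t]
    (if (exc/success? t)
      (exc/try-on (f (exc/extract t)))
      t)))

;; Hypothetical plain version of store-entity: it knows nothing about Try.
(defn store-entity-plain [id]
  (/ 1 id))

(def lifted-results
  (->> [1 0 2]
       (map #(exc/try-on %))                ; wrap each id in a Success
       (map (lift-try store-entity-plain))  ; the 0 becomes a Failure
       (doall)))
```

With this shape every pipeline step can stay an ordinary function, and only the pipeline definition itself mentions Try.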
Now you can filter results on successes or failures with success? or failure? respectively, and retrieve the values via cats/extract:
(def successful-results
  (->> results
       (filter exc/success?)
       (mapv cats/extract)))
successful-results ;; => [1/2]
(def error-messages
  (->> results
       (filter exc/failure?)
       (mapv cats/extract) ; gets exceptions without raising them
       (mapv #(.getMessage %))))
error-messages ;; => ["id is 1, help!" "Divide by zero"]
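The question also asks which ids were affected. Since pmap returns results in the same order as its input, one possible way to recover them is to pair each id with its result by position. The function failed-ids below is a hypothetical helper, written against a generic failure? predicate so the sketch runs without any library; the sample results are made-up plain values standing in for Try instances.

```clojure
(defn failed-ids
  "Pairs each input id with its result by position and returns the ids
   whose result satisfies failure?."
  [failure? ids results]
  (->> (map vector ids results)
       (filter (fn [[_ result]] (failure? result)))
       (mapv first)))

;; Stand-in data: exceptions as plain values in place of Failure instances.
(def sample-results
  [(ex-info "id is 1, help!" {:id 1})
   (ArithmeticException. "Divide by zero")
   1/2])

(def sample-failed
  (failed-ids #(instance? Throwable %) [1 0 2] sample-results))
;; sample-failed => [1 0]
```

With the cats-based pipeline above, the equivalent call would be (failed-ids exc/failure? ids results).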
Note that if you want to loop over the errors or successful-results only once, you can use a transducer as follows:
(transduce (comp
            (filter exc/success?)
            (map cats/extract))
           conj
           results)
;; => [1/2]