In OCaml, imperative-style loops can be exited early by raising exceptions.
While the use of imperative loops is not idiomatic per se in OCaml, I'd like to know what are the most idiomatic ways to emulate imperative loops with early exits (taking into account aspects such as performance, if possible).
For instance, an old OCaml FAQ mentions exception Exit
:
Exit
: used to jump out of loops or functions.
Is it still current? The standard library simply mentions it as a general-purpose exception:
The
Exit
exception is not raised by any library function. It is provided for use in your programs.
Relatedly, this answer to another question mentions using a precomputed let exit = Exit
exception to avoid allocations inside the loop. Is it still required?
Also, sometimes one wants to exit from the loop with a specific value, such as raise (Leave 42)
. Is there an idiomatic exception or naming convention to do this? Should I use references in this case (e.g. let res = ref -1 in ... <loop body> ... res := 42; raise Exit
)?
Finally, the use of Exit
in nested loops prevents some cases where one would like to exit several loops, like break <label>
in Java. This would require defining exceptions with different names, or at least using an integer to indicate how many scopes should be exited (e.g. Leave 2
to indicate that 2 levels should be exited). Again, is there an approach/exception naming that is idiomatic here?
OCaml doesn't support the concept of breaking out of a for loop early i.e. it has no break , continue or last statements. (You could throw an exception and catch it outside, and this would run fast but often looks clumsy.)
OCaml has exception handling to deal with exceptions, application-level errors, out-of-band data and the like. Exceptions are a very dynamic construct, but there is a third-party application that performs a static analysis of your use of exceptions and can find inconsistencies and errors at compile time!
The simplest one is failwith , which could be defined as follows: let failwith msg = raise (Failure msg);; val failwith : string -> 'a = <fun> OCaml. There are several other useful functions for raising exceptions, which can be found in the API documentation for the Common and Exn modules in Base.
As originally posted in comments, the idiomatic way to do early exit in OCaml is using continuations. At the point where you want the early return to go to, you create a continuation, and pass it to the code that might return early. This is more general than labels for loops, since you can exit from just about anything that has access to the continuation.
Also, as posted in comments, note the usage of raise_notrace
for exceptions whose trace you never want the runtime to generate.
A "naive" first attempt:
module Continuation :
sig
(* This is the flaw with this approach: there is no good choice for
the result type. *)
type 'a cont = 'a -> unit
(* with_early_exit f passes a function "k" to f. If f calls k,
execution resumes as if with_early_exit completed
immediately. *)
val with_early_exit : ('a cont -> 'a) -> 'a
end =
struct
type 'a cont = 'a -> unit
(* Early return is implemented by throwing an exception. The ref
cell is used to store the value with which the continuation is
called - this is a way to avoid having to generate an exception
type that can store 'a for each 'a this module is used with. The
integer is supposed to be a unique identifier for distinguishing
returns to different nested contexts. *)
type 'a context = 'a option ref * int64
exception Unwind of int64
let make_cont ((cell, id) : 'a context) =
fun result -> cell := Some result; raise_notrace (Unwind id)
let generate_id =
let last_id = ref 0L in
fun () -> last_id := Int64.add !last_id 1L; !last_id
let with_early_exit f =
let id = generate_id () in
let cell = ref None in
let cont : 'a cont = make_cont (cell, id) in
try
f cont
with Unwind i when i = id ->
match !cell with
| Some result -> result
(* This should never happen... *)
| None -> failwith "with_early_exit"
end
let _ =
let nested_function i k = k 15; i in
Continuation.with_early_exit (nested_function 42)
|> string_of_int
|> print_endline
As you can see, the above implements early exit by hiding an exception. The continuation is actually a partially applied function that knows the unique id of the context for which it was created, and has a reference cell to store the result value while the exception is being thrown to that context. The code above prints 15. You can pass the continuation k
as deep as you want. You can also define the function f
immediately at the point where it is passed to with_early_exit
, giving an effect similar to having a label on a loop. I use this very often.
The problem with the above is the result type of 'a cont
, which I arbitrarily set to unit
. Actually, a function of type 'a cont
never returns, so we want it to behave like raise
– be usable where any type is expected. However, this doesn't immediately work. If you do something like type ('a, 'b) cont = 'a -> 'b
, and pass that down to your nested function, the type checker will infer a type for 'b
in one context, and then force you to call continuations only in contexts with the same type, i.e. you won't be able to do things like
(if ... then 3 else k 15)
...
(if ... then "s" else k 16)
because the first expression forces 'b
to be int
, but the second requires 'b
to be string
.
To solve this, we need to provide a function analogous to raise
for early return, i.e.
(if ... then 3 else throw k 15)
...
(if ... then "s" else throw k 16)
This means stepping away from pure continuations. We have to un-partially-apply make_cont
above (and I renamed it to throw
), and pass the naked context around instead:
module BetterContinuation :
sig
type 'a context
val throw : 'a context -> 'a -> _
val with_early_exit : ('a context -> 'a) -> 'a
end =
struct
type 'a context = 'a option ref * int64
exception Unwind of int64
let throw ((cell, id) : 'a context) =
fun result -> cell := Some result; raise_notrace (Unwind id)
let generate_id = (* Same *)
let with_early_exit f =
let id = generate_id () in
let cell = ref None in
let context = (cell, id) in
try
f context
with Unwind i when i = id ->
match !cell with
| Some result -> result
| None -> failwith "with_early_exit"
end
let _ =
let nested_function i k = ignore (BetterContinuation.throw k 15); i in
BetterContinuation.with_early_exit (nested_function 42)
|> string_of_int
|> print_endline
The expression throw k v
can be used in contexts where different types are required.
I use this approach pervasively in some big applications I work on. I prefer it even to regular exceptions. I have a more elaborate variant, where with_early_exit
has a signature roughly like this:
val with_early_exit : ('a context -> 'b) -> ('a -> 'b) -> 'b
where the first function represents an attempt to do something, and the second represents the handler for errors of type 'a
that may result. Together with variants and polymorphic variants, this gives a more explicitly-typed take on exception handling. It is especially powerful with polymorphic variants, as the set of error variands can be inferred by the compiler.
The Jane Street approach effectively does the same as what is described here, and in fact I previously had an implementation that generated exception types with first-class modules. I am not sure anymore why I eventually chose this one – there may be subtle differences :)
Just to answer a specific part of my question which was not mentioned in other answers:
... using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?
I did some micro-benchmarks using Core_bench
on 4.02.1+fp
and the results indicate no significant difference: when comparing two identical loops, one containing a local exit
declared before the loop and another one without it, the time difference is minimal.
The difference between raise Exit
and raise_notrace Exit
in this example was also minimal, about 2% in some runs, up to 7% in others, but it could well be within the error margins of such a short experiment.
Overall, I couldn't measure any noticeable difference, so unless someone would have examples where Exit/exit
significantly affect performance, I would prefer the former since it is clearer and avoids creating a mostly useless variable.
Finally, I also compared the difference between two idioms: using a reference to a value before exiting the loop, or creating a specific exception type containing the return value.
With reference to result value + Exit
:
let res = ref 0 in
let r =
try
for i = 0 to n-1 do
if a.(i) = v then
(res := v; raise_notrace Exit)
done;
assert false
with Exit -> !res
in ...
With specific exception type:
exception Res of int
let r =
try
for i = 0 to n-1 do
if a.(i) = v then
raise_notrace (Res v)
done;
assert false
with Res v -> v
in ...
Once again, the differences were minimal and varied a lot between runs. Overall, the first version (reference + Exit
) seemed to have a slight advantage (0% to 10% faster), but the difference was not significant enough to recommend one version over the another.
Since the former requires defining an initial value (which may not exist) or using an option type to initialize the reference, and the latter requires defining a new exception per type of value returned from the loop, there is no ideal solution here.
Exit
is ok (I'm not sure whether I can say that it is idiomatic). But, make sure, that you're using raise_notrace
, if you're using recent enough compiler (since 4.02).
Even better solution, is to use with_return
from OCaml Core library. It will not have any problems with scope, because it will create a fresh new exception type for each nesting.
Of course, you can achieve the same results, or just take the source code of Core's implementation.
And even more idiomatic, is not to use exceptions for short-circuiting your iteration, and consider to use existing algorithm (find
, find_map
, exists
, etc) or just write a recursive function, if no algorithm suits you.
Regarding the point
using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?
the answer is no with sufficiently recent versions of OCaml. Here is the relevant excerpt from the Changelog of OCaml 4.02.0.
- PR#6203: Constant exception constructors no longer allocate (Alain Frisch)
Here is PR6203: http://caml.inria.fr/mantis/view.php?id=6203
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With