I've often read that exceptions are somewhat slow and should be avoided if performance is an issue (for instance, in Java, F#, etc). Does that apply to common OCaml functions such as Hashtbl.find
, which return exceptions for elements not found?
In particular, if I want my application to be efficient, should I always test element membership using, for instance, Hashtable.mem
before calling Hashtbl.find
? Or would the extra comparison of the mem
function negatively impact performance?
OCaml has exception handling to deal with exceptions, application-level errors, out-of-band data and the like. Exceptions are a very dynamic construct, but there is a third-party application that performs a static analysis of your use of exceptions and can find inconsistencies and errors at compile time!
The simplest one is failwith , which could be defined as follows: let failwith msg = raise (Failure msg);; val failwith : string -> 'a = <fun> OCaml. There are several other useful functions for raising exceptions, which can be found in the API documentation for the Common and Exn modules in Base.
A value Ok x means that the computation succeeded with x , and a value Error e means that it failed. Pattern matching can be used to deal with both cases, as with any other sum type.
OCaml has an exception mechanism similar to many other programming languages. A new type of OCaml exception is defined with this syntax: where E is a constructor name and t is a type. The of t is optional. Notice how this is similar to defining a constructor of a variant type. For example:
As we originally said, every OCaml expression either or fails to terminate (i.e., an “infinite loop”). So far we’ve only presented the part of the dynamic semantics that handles the first of those three cases. What happens when we add exceptions? Now, evaluation of an expression either produces a value or produces an exception packet .
Packets are not normal OCaml values; the only pieces of the language that recognizes them are raise and try. The exception value produced by (e.g.) Failure "oops" is part of the exception packet produced by raise (Failure "oops"), but the packet contains more than just the exception value; there can also be a stack trace, for example.
Here, for example, is an exception value whose constructor is Failure, which carries a string: This constructor is pre-defined in the standard library and is one of the more common exceptions that OCaml programmers use. There is a convenient function failwith : string -> 'a in the standard library that raises Failure.
OCaml exception handling make raising and catching exceptions extremely fast -- see this SO thread for internal details on how it's implemented. I haven't taken care to benchmark it precisely, but my random guess would be that it is in the ballpark of an indirect function call.
It is known that OCaml exceptions are significantly faster, in proportion to the rest of the language, than exceptions of F# for example -- this gave rise to performance problem for people porting their code from OCaml to F#. In OCaml, exceptions don't cause performance problems.
Calling Hashtbl.mem
before Hashtbl.find
is likely to be slower than catching the exception. The idiomatic style tends to be try Hashtbl.find .. with Not_found -> ...
.
That said, there is a sensible movement in the OCaml community to use more explicit error handling style, using option
types rather than exceptions. The rationale is not based on performances, but on the fact that the type-checker then stop you from forgetting to handle an error situation. I would prefer that when designing a fresh new API. When using a third-party function that raise exceptions, make sure to catch all possible exceptions immediately; doing otherwise is a design smell in general and should be very heavily justified.
Due to their convenience, OCaml exceptions are also often used as a pure control-flow mechanism (and not to signal a rare failure condition). You will encounter code such as:
try
for i = 0 to .... do
if .. then raise Exit
done; false
with Exit -> true
Finally, I feel that you might be taking a bad approach to implementation choices. Asking general micro-questions about performances is generally not the way to go. Think of correctness and readability first. Performance questions should generally come later, and only in situation where it is measurable/profilable.
To first answer the concrete question for Hashtbl.mem
before Hashtbl.find
:
Don't do it as the check for the existence of the element in the table will have to be dont twice; while the cost will be O(1) for both strategies, calling Hashtbl.mem
first will compute the hash-value and the lookup twice -- which can take quite longer than fetching an exception.
As a general advice, only create functions that raise exceptions if the probability of an exception is low. The exceptions are not taken into consideration for the type system, so more robust implementations will have an 'a option
return value instead of 'a
plus a possible exception. The ocaml core library uses a suffix of _exn
to make clear that a function may throw an exception but generally prefers the non-exception throwing ones.
So for the hashtable you should (in a perfect world) have two functions:
Hashtbl.find_exn : 'a t -> key -> 'a (* throws Not_found *)
Hashtbl.find : 'a t -> key -> 'a option
If in doubt, use the second one. If it is not for pure speed, you could write a simple wrapper around the original hash-table:
find_safe h k = try Some (Hashtbl.find h k) with Not_found -> None
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With