It seems that Haskell tries to be a safe language and to protect the programmer from mistakes. For example, pred and succ throw an error when the result would fall outside the type's range, and div 1 0 also throws. What are these safe computations in Haskell, and what overhead do they cause?
Is it possible to turn off this kind of safety in GHC, since the checks should not be necessary in a bug-free program? Can doing so improve performance?
The C backend had the option -ffast-math. Are there any such performance options for the LLVM backend, or for LLVM itself?
The benchmark was indeed seriously flawed in the previous version of this answer. I apologise.
Indeed, pred, succ and other functions raise exceptions when various errors occur, such as overflow and division by zero. Normal arithmetic functions are just wrappers around low-level unsafe functions; as an example, look at the implementation of div for Int32:
div x@(I32# x#) y@(I32# y#)
    | y == 0                     = divZeroError
    | x == minBound && y == (-1) = overflowError
    | otherwise                  = I32# (x# `divInt32#` y#)
Notice that two checks are performed before the actual division!
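Both checks are easy to observe from GHCi (after import Data.Int):

> (minBound :: Int32) `div` (-1)
*** Exception: arithmetic overflow
> 1 `div` (0 :: Int32)
*** Exception: divide by zero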
These aren’t the worst offenders, however. There is also bounds checking for arrays, which can slow code down considerably. This problem is traditionally solved by providing special variants of functions with the checks disabled (such as unsafeAt).
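To illustrate the difference, here is a minimal sketch (unsafeAt lives in Data.Array.Base; note that it takes a 0-based offset, which coincides with the index for the bounds chosen here):

import Data.Array
import Data.Array.Base (unsafeAt)

arr :: Array Int Int
arr = listArray (0, 9) [0 ..]

-- (!) validates the index against the bounds and throws on an
-- out-of-range index
checked :: Int -> Int
checked i = arr ! i

-- unsafeAt reads the element directly with no check; an
-- out-of-range offset is undefined behaviour
unchecked :: Int -> Int
unchecked i = arr `unsafeAt` i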
As pointed out by Daniel Fischer here, there is a solution that lets you disable/enable the checks with a single pragma. Unfortunately, it is quite cumbersome: you’d have to copy the source of GHC.Int and cut the checks out of every function. And GHC.Int isn’t the only source of such functions, of course.
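For illustration, after cutting out the checks, the div above would shrink to just the primitive call (a sketch of what the edited GHC.Int source would contain):

div (I32# x#) (I32# y#) = I32# (x# `divInt32#` y#)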
If you really want to be able to disable the checks, you’ll have to import Prelude hiding (succ, pred, div, ...) and import Unsafe (succ, pred, div, ...). This variant wouldn’t allow simple switching between safe and unsafe functions, though.

Suppose there is a number which is known not to be zero (hence the checks aren’t needed). Known to whom? Either to the compiler, or to you. In the first case we can expect the compiler not to perform any checks, of course. But in the second case our knowledge is useless unless we can somehow tell the compiler about it. So the problem is: how do we encode the knowledge we have? This is a well-known problem, with multiple solutions. The obvious one is to make the programmer explicitly use an unsafe function (unsafeRem). Another solution is to introduce some compiler magic:
{-# ASSUME x/=0 #-}
gcd x y = ...
But we functional programmers have types. And we’re used to encoding information with types. And some of us are great at it. So, the smartest solution would be to either introduce a family of Unsafe types, or switch to dependent types (i.e. learn Agda).
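Here is a minimal sketch of the type-based approach, assuming a hand-rolled NonZero wrapper (the names are illustrative, not a standard API):

{-# LANGUAGE MagicHash #-}
import GHC.Exts

-- In a real module the NonZero constructor would not be exported,
-- making the smart constructor nonZero the only way to build one;
-- every value of type NonZero is then non-zero by construction.
newtype NonZero = NonZero Int

nonZero :: Int -> Maybe NonZero
nonZero 0 = Nothing
nonZero n = Just (NonZero n)

-- The zero check happened once, at construction time, so division
-- can use the unchecked primitive (like unsafeRem in the benchmark
-- below, this also skips the minBound/-1 overflow check).
remNZ :: Int -> NonZero -> Int
remNZ (I# a) (NonZero (I# b)) = I# (remInt# a b)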
For more information, read about non-empty lists. The concern there is more about safety than performance, but the problem is the same.
Let’s try to measure the difference between the safe and unsafe rem:
{-# LANGUAGE MagicHash #-}
import GHC.Exts
import Criterion.Main
--assuming a >= b
--the type signatures are needed to prevent defaulting to Integer
safeGCD, unsafeGCD :: Int -> Int -> Int
safeGCD a b = if b == 0 then a else safeGCD b (rem a b)
unsafeGCD a b = if b == 0 then a else unsafeGCD b (unsafeRem a b)
{-# INLINE unsafeRem #-}
unsafeRem (I# a) (I# b) = I# (remInt# a b)
main = defaultMain [bench "safe"   $ whnf (safeGCD   12452650) 11090050,
                    bench "unsafe" $ whnf (unsafeGCD 12452650) 11090050]
The difference doesn’t seem to be that huge:
$ ghc -O2 ../bench/bench.hs && ../bench/bench
benchmarking unsafe
mean: 215.8124 ns, lb 212.4020 ns, ub 220.1521 ns, ci 0.950
std dev: 19.71321 ns, lb 16.04204 ns, ub 23.83883 ns, ci 0.950
benchmarking safe
mean: 250.8196 ns, lb 246.7827 ns, ub 256.1225 ns, ci 0.950
std dev: 23.44088 ns, lb 19.06654 ns, ub 28.23992 ns, ci 0.950
A clarification of what safety overhead is being added:
First of all, if a safety measure can lead to an exception, you can learn about it here: there is a list of all the types of exceptions that can be thrown.
Programmer-caused exceptions (no artificial overhead):

- ErrorCall: caused by error.
- AssertionFailed: caused by assert (see the sketch after this list).

Exceptions thrown by standard libraries (rewrite the library and the safety overhead is gone):

- ArithException: division by zero is one of these. Also covers overflow/underflow and some less common ones.
- ArrayException: thrown when an index is out of bounds or when you try to reference an undefined element.
- IOException: don’t worry about these; the overhead is negligible compared to the overhead of IO itself.

Runtime exceptions (caused by GHC, unavoidable):

- AsyncException: stack and heap overflows. Only minor overhead.
- PatternMatchFail: no overhead (the same way the else in if...then...else doesn’t create any).
- Rec*Error: thrown when you try to access a non-existent field of a record. Causes some overhead, since the field’s existence has to be checked.
- NoMethodError: no overhead.

Second, if there exists a safety measure that does not cause an exception, I would really like to hear about it (and then file a bug against GHC).
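Regarding assert from the list above: GHC already provides a per-build switch for that particular check, since assertions are ignored when compiling with -O (or with -fignore-asserts). A minimal sketch:

import Control.Exception (assert)

-- A head with a programmer-supplied precondition. Compiled without
-- optimisation, a failed assertion throws AssertionFailed with the
-- source location; compiled with -O (or -fignore-asserts), the
-- assert disappears entirely and this is just head.
headChecked :: [a] -> a
headChecked xs = assert (not (null xs)) (head xs)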
By the way, -ffast-math didn’t affect any of these checks (they were done in Haskell code, not in C); it simply made floating-point operations faster at the expense of precision in some edge cases.