Evaluation strategy

How should one reason about function evaluation in examples like the following in Haskell:

    let f x = ...
        x = ...
    in map (g (f x)) xs

In GHC, (f x) is sometimes evaluated only once and sometimes once for each element of xs, depending on what exactly f and g are. This can matter when f x is an expensive computation. It has just tripped up a Haskell beginner I was helping, and I didn't know what to tell him other than that it is up to the compiler. Is there a better story?

Update

In the following example (f x) will be evaluated 4 times:

    let f x = trace "!" $ zip x x
        x = "abc"
    in map (\i -> lookup i (f x)) "abcd"
Grzegorz Chrupała asked Feb 24 '12 23:02


2 Answers

With language extensions, we can create situations where f x must be evaluated repeatedly:

    {-# LANGUAGE GADTs, Rank2Types #-}
    module MultiEvG where

    data BI where
        B :: (Bounded b, Integral b) => b -> BI

    foo :: [BI] -> [Integer]
    foo xs = let f :: (Integral c, Bounded c) => c -> c
                 f x = maxBound - x
                 g :: (forall a. (Integral a, Bounded a) => a) -> BI -> Integer
                 g m (B y) = toInteger (m + y)
                 x :: (Integral i) => i
                 x = 3
             in map (g (f x)) xs

The crux is to keep f x polymorphic even as the argument of g, and to create a situation where the type(s) at which it is needed can't be predicted. (My first stab used an Either a b instead of BI, but with optimisations that of course led to at most two evaluations of f x.)

A polymorphic expression must be evaluated at least once for each type it is used at. That's one reason for the monomorphism restriction. However, when the range of types it can be needed at is restricted, it is possible to memoise the values at each type, and in some circumstances GHC does that (this needs optimisations, and I expect the number of types involved mustn't be too large). Here we confront it with what is basically an inhomogeneous list, so in each invocation of g (f x) the value can be needed at an arbitrary type satisfying the constraints, and the computation cannot be lifted outside the map. (Technically, the compiler could still build a cache of the values at each used type, so that f x would be evaluated only once per type, but GHC doesn't; in all likelihood it wouldn't be worth the trouble.)

  • Monomorphic expressions need only be evaluated once; they can be shared. Whether they are is up to the implementation; by purity, it doesn't change the semantics of the programme. If the expression is bound to a name, in practice you can rely on it being shared, since it's easy and obviously what the programmer wants. If it isn't bound to a name, it's a question of optimisation. With the bytecode generator or without optimisations, the expression will often be evaluated repeatedly, but with optimisations repeated evaluation would indicate a compiler bug.
  • Polymorphic expressions must be evaluated at least once for every type they're used at, but with optimisations, when GHC can see that an expression may be used multiple times at the same type, it will (usually) still be shared for that type during a larger computation.
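
The difference between the two bullets can be made visible with a small sketch. The names polyVal, monoVal, usePoly and useMono are made up for illustration and are not from the question. The trace output depends on the compiler and on optimisation flags, so only the computed values are fixed here.

```haskell
import Debug.Trace (trace)

-- Polymorphic (hypothetical name): GHC typically compiles this to a function
-- of the Num dictionary, so it has to be evaluated at least once per type it
-- is used at.
polyVal :: Num a => a
polyVal = trace "poly" (2 + 3)

-- Monomorphic (hypothetical name): an ordinary top-level thunk that can be
-- evaluated once and then shared by every use.
monoVal :: Integer
monoVal = trace "mono" (2 + 3)

-- polyVal is used at two different types, Integer and Double.
usePoly :: (Integer, Double)
usePoly = (polyVal, polyVal)

-- monoVal is used twice at the same type.
useMono :: (Integer, Integer)
useMono = (monoVal, monoVal)

main :: IO ()
main = do
  print usePoly
  print useMono
```

Running this, one would expect "poly" to appear on stderr at least twice (once per type) and "mono" only once, but the exact counts are up to the implementation, as described above.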

Bottom line: Always compile with optimisations, help the compiler by binding expressions you want shared to a name, and give monomorphic type signatures where possible.
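
As a minimal sketch of the "bind it to a name" advice, here is the example from the question next to a variant in which the expensive value is named outside the lambda. The names lookupsUnshared, lookupsShared and table are assumptions introduced for this illustration.

```haskell
import Debug.Trace (trace)

-- Shape from the question: (f x) sits under the lambda, so unless the
-- optimiser floats it out, it may be rebuilt for every element.
lookupsUnshared :: [Maybe Char]
lookupsUnshared =
  let f x = trace "!" $ zip x x
      x   = "abc"
  in map (\i -> lookup i (f x)) "abcd"

-- Same computation, but the result of (f x) is bound to a name outside the
-- lambda, so every call of the lambda refers to the same thunk.
lookupsShared :: [Maybe Char]
lookupsShared =
  let f x   = trace "!" $ zip x x
      x     = "abc"
      table = f x   -- hypothetical name for the shared value
  in map (\i -> lookup i table) "abcd"

main :: IO ()
main = do
  print lookupsUnshared
  print lookupsShared
```

Both definitions compute the same list. The difference is only in how often the trace fires: the second version should print "!" once regardless of optimisation level, while the first depends on what the compiler does with the lambda.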

Daniel Fischer answered Nov 07 '22 12:11


Your examples are indeed quite different.

In the first example, the argument to map is g (f x), which is passed to map once, most likely as a partially applied function. If g (f x), when applied to an argument within map, evaluates its first argument, this is done only once, and the thunk (f x) is then updated with the result.

Hence, in your first example, f x will be evaluated at most once.
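
A small sketch of this first case, with concrete stand-ins for the elided f and g. The bodies of f and g and the name result are assumptions chosen only to make the example runnable.

```haskell
import Debug.Trace (trace)

-- Stand-in for the expensive function from the question (assumed body).
f :: Int -> Int
f n = trace "f evaluated" (n * n)

-- Stand-in for g: it forces its first argument whenever the sum is demanded.
g :: Int -> Int -> Int
g a b = a + b

-- g (f x) is built once as a partial application holding a single thunk for
-- (f x); map then applies that same partial application to every element.
result :: [Int]
result = let x = 7 in map (g (f x)) [1, 2, 3]

main :: IO ()
main = print result
```

Since all three applications share the one thunk, the trace message should appear at most once, in line with the reasoning above.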

Your second example requires a deeper analysis before the compiler can conclude that (f x) is constant within the lambda expression. Perhaps it will never optimize it at all, because it may know that trace is not quite kosher. So this may evaluate 4 times when tracing, and either 4 times or once when not tracing.

Ingo answered Nov 07 '22 12:11