Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Running a subexpression once

OK, so this has been bothering me for a while, so I thought I'd come and ask somebody who might actually know the answer.

Suppose I have the following function:

foobar x y = expensive x + cheap y

Suppose, further, that part of the program takes foobar 5 as input, and executes this function millions of times in a tight loop. Clearly, I want expensive 5 to be computed once, not a million times.

I could leave the code as it is, or I could change it to

foobar x = let k = expensive x in \ y -> k + cheap y

This leaves me wondering...

  • Is GHC smart enough to eliminate the duplicated work by itself? (I.e., does the first version do what I want already?)

  • If no, does the second version actually fix the problem? (I.e., will the optimiser just transform it back into the same code as the first version?)

like image 818
MathematicalOrchid Avatar asked Jul 27 '13 17:07

MathematicalOrchid


2 Answers

Is GHC smart enough to eliminate the duplicated work by itself? (I.e., does the first version do what I want already?)

I think another way of asking this is: Will GHC inline foobar x y so that expensive x is shared between computations?

I asked a similar question which cleared up a few things but left me a little unsatisfied. As I understand it, determining how and when to inline or eta-expand/reduce things (and when doing so subtly changes strictness behavior/semantics) is really tricky for the compiler, so GHC relies heavily on how you've defined your function syntactically.

I think the short answer is that GHC might transform your first function to the second, but the only way to be sure is to write your functions so the syntax gives the compiler hints about how you'd like things to be inlined to get the sharing you want, and then provide INLINE pragmas. Here's another interesting discussion of this issue

like image 55
jberryman Avatar answered Nov 02 '22 00:11

jberryman


Intuitively my answer would have been no, and yes. But let me answer you question by trying it out. Consider this code:

import Debug.Trace

expensive :: Int -> Int
expensive x = trace ("expensive evaluated for " ++ show x) $ x
{-# NOINLINE expensive #-}

cheap :: Int -> Int
cheap x = x
{-# NOINLINE cheap #-}

foobar x y = expensive x + cheap y

foobar' x = let k = expensive x in \ y -> k + cheap y

part f = sum [f i| i<-[0..10]]

main = do
    print $ part (foobar 5)
    print $ part (foobar' 5)

If we run this the result is

$ ./Test 
expensive evaluated for 5
110
expensive evaluated for 5
110

so the compiler was smart enough to optimize the original version as well. But why? Because he inlined the definition of foobar in main, then noticed that it could float the expression expensive 5 out of the call to part. If we disable inlining for foobar and foobar' (or alternatively not use -O), it does not work any more:

$ ./Test 
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
expensive evaluated for 5
110
expensive evaluated for 5
110

So while GHC can in some situations do the right thing, you should always check if it is the case if you want to rely on it. Either using tools like Debug.Trace, or by looking at the core (-ddump-simpl).

like image 29
Joachim Breitner Avatar answered Nov 01 '22 23:11

Joachim Breitner