I've been struggling with low-level manual loop optimization in GHC. My program contains some loops that perform numerical computation. The real data is wrapped in other data structures, and the program is broken down into "looping control flow functions" and "computation functions" in such a way that some data structure fields end up being read inside inner loops. I want GHC to move those reads out of the inner loops. Here's a simplified version of the code, to show what's going on.
data D = D !Double !C
data C = C Double
-- This function is called in every loop iteration.
-- Parameter 'c' is loop-invariant.
exampleLoopBody i a c =
case c of C b -> a + b * fromIntegral i
-- The body of this function is a counted loop that should be optimized
foo x =
case x
of D acc0 c ->
let loop i acc =
if i > 100
then acc
else loop (i+1) (exampleLoopBody i acc c)
in loop 0 acc0
Every loop iteration evaluates case c of C b
, but that is redundant computation because c
is loop-invariant. I can make GHC lift it out by putting a redundant case expression outside the loop:
foo x =
case x
of D acc0 c ->
case c -- This case statement inserted for optimization purposes
of C b -> b `seq` -- It will read 'b' outside of the loop
let loop i acc =
if i > 100
then acc
else loop (i+1) (exampleLoopBody i acc c)
in loop 0 acc0
The compiler inlines exampleLoopBody
. Afterward, the inner case statement is redundant and gets eliminated:
foo x =
case x
of D acc0 c ->
case c
of C b -> b `seq`
let loop i acc =
if i > 100
then acc
else loop (i+1) (acc + b * fromIntegral i) -- The inlined case expression disappears
in loop 0 acc0
The purpose of seq
is to ensure that the case expression is not dead code. The seq
checks whether b
is _|_
. GHC notices that, since b
has been computed, it is useful to reuse that value in the loop body.
Now, here is the problem: I really want all the relevant data fields to be strict. If I insert strictness annotations in the data definition, like this,
data C = C !Double
then the seq
and case c of C b
have no effect as far as GHC is concerned. GHC deletes them, and I get this:
foo x =
case x
of D acc0 c ->
let loop i acc =
if i > 100
then acc
else loop (i+1) (case c of C b -> acc + b * fromIntegral i) -- Evaluate the case in every iteration
in loop 0 acc0
This code evaluates case c of C b
in every iteration, which is just what I was trying to avoid.
If I can't rely on seq
, I don't know how else to force b
to be computed outside the loop body. Is there some trick I can use in this case?
You could try rearranging the arguments and moving the loop variant parts into a lambda:
-- note the order of the arguments changed
exampleLoopBody (C b) =
\i a -> a + b * fromIntegral i
foo (D acc0 c) =
let
loopBody = exampleLoopBody c
loop i acc =
if i > 100
then acc
else loop (i+1) (loopBody i acc)
in loop 0 acc0
Also, this code builds up a large unevaluated expression at the moment so you may want to force the accumulator parameter every time through the loop.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With