I have a problem with Haskell module optimization.
There is Main module.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Control.DeepSeq
import Formatting
import Formatting.Clock
import System.Clock
import Data.Array
size :: Int
size = 200 :: Int
stdMult :: (Ix a, Ix b, Ix c, Num d) =>
Array (a,b) d -> Array (b,c) d -> Array (a,c) d
stdMult x y = array resultBounds
[((i,j), sum [ x!(i,k) * y!(k,j) | k <- range (lj,uj)])
| i <- range (li,ui),
j <- range (lj',uj') ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "error"
main :: IO ()
main = do
let a = array ((1,1),(size, size)) [((i,j), 2*i-j) |
i <- range (1,size),
j <- range (1,size)]
let b = array ((1,1),(size, size)) [((i,j), 2*i+3*j) |
i <- range (1,size)`,
j <- range (1,size)]
start <- getTime ProcessCPUTime
let
c = stdMult a b
end <- c `deepseq` getTime ProcessCPUTime
fprint (timeSpecs % "\n") start end
return()
When stdMult
in Main module, everything works ok. I replace stdMult
to another module.
When I don't use ghc optimization, execution time is the same.
When I use ghc options -O3, when stdMult
in Main module time execution decreases, but when stdMult
in another module, execution time is almost unchanged!
For example, when stdMult
in Main I have time 3 seconds, and when stdMult
not in Main I have time 30 seconds, for matrix 500x500.
It is very strange!
(You need the clock
and formatting
packages from Hackage to compile the code.)
I can reproduce the 10x slowdown when stdMult
is in a different module. Luckily a fix is easy: in the module where stdMult
is defined, add an INLINABLE
pragma:
{-# INLINABLE stdMult #-}
It adds the definition to the interface file (.hi
) which allows inlining in the modules that uses it, which in turn allows it to be specialized to fast machine Int
instead of slow abstract Ix
and Num
polymorphic code. (If it's in the same module GHC can inline and specialize at will, and things aren't INLINABLE
by default because it can cause executable code bloat and slower compilation.)
Alternatively to INLINABLE
, you can manually SPECIALIZE
to the types you want optimized implementations for. This is a bit more verbose, but should be faster to compile in big projects (it will be specialized once per export, instead of once per import, at a rough guess).
{-# SPECIALIZE stdMult :: Array (Int, Int) Int -> Array (Int, Int) Int -> Array (Int, Int) Int #-}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With