solrize in #haskell asked a question about one version of this code and I tried some other cases and was wondering what was going on. On my machine the "fast" code takes ~1 second and the "slow" code takes ~1.3-1.5 (everything is compiled with ghc -O2).
import Data.List
log10 :: Double -> Double
--log10 x = log x / log 10 -- fast
--log10 = logBase 10 -- slow
--log10 = barLogBase 10 -- fast
--log10 = bazLogBase 10 -- fast
log10 = fooLogBase 10 -- see below
class Foo a where
    fooLogBase :: a -> a -> a
instance Foo Double where
    --fooLogBase x y = log y / log x -- slow
    fooLogBase x = let lx = log x in \y -> log y / lx -- fast
barLogBase :: Double -> Double -> Double
barLogBase x y = log y / log x
bazLogBase :: Double -> Double -> Double
bazLogBase x = let lx = log x in \y -> log y / lx
main :: IO ()
main = print . foldl' (+) 0 . map log10 $ [1..1e7]
I'd've hoped that GHC would be able to turn logBase x y into exactly the same thing as log y / log x, when specialised. What's going on here, and what would be the recommended way of using logBase?
As always, look at the Core.
Fast (1.563s) -
-- note: top level constant, referred to by specialized fooLogBase
Main.main_lx :: GHC.Types.Double
Main.main_lx =
     case GHC.Prim.logDouble# 10.0 of { r ->
     GHC.Types.D# r
  }
Main.main7 :: GHC.Types.Double -> GHC.Types.Double
Main.main7 =
  \ (y :: GHC.Types.Double) ->
    case y of _ { GHC.Types.D# y# ->
    case GHC.Prim.logDouble# y# of { r0 ->
    case Main.main_lx of { GHC.Types.D# r ->
    case GHC.Prim./## r0 r of { r1 ->
    GHC.Types.D# r1
    }
    }
    }
Slow (2.013s)
-- simpler, but recomputes log10 each time
Main.main7 =
  \ (y_ahD :: GHC.Types.Double) ->
    case y_ahD of _ { GHC.Types.D# x_aCD ->
    case GHC.Prim.logDouble# x_aCD of wild1_aCF { __DEFAULT ->
    case GHC.Prim.logDouble# 10.0 of wild2_XD9 { __DEFAULT ->
    case GHC.Prim./## wild1_aCF wild2_XD9 of wild3_aCz { __DEFAULT ->
    GHC.Types.D# wild3_aCz
    }
    }
    }
    }
In the fast version, log10 is computed once and shared (the static argument is applied once only). In the slow version it is recomputed each time.
You can follow this line of reasoning to produce even better versions:
-- 1.30s
lx :: Double
lx = log 10
log10 :: Double -> Double
log10 y = log y / lx
main :: IO ()
main = print . foldl' (+) 0 . map log10 $ [1..1e7]
And, using array fusion, you can remove the penalty of the compositional style:
import qualified Data.Vector.Unboxed as V
lx :: Double
lx = log 10
log10 :: Double -> Double
log10 y = log y / lx
main :: IO ()
main = print . V.sum . V.map log10 $ V.enumFromN 1 (10^7)
Cutting the cost by 3x
$ time ./A
6.5657059080059275e7
real    0m0.672s
user    0m0.000s
sys     0m0.000s
Which is as good as writing it by hand. The below offers no benefit over the correctly written version above.
lx :: Double
lx = D# (GHC.Prim.logDouble# 10.0##)
log10 :: Double -> Double
log10 (D# y) = D# (case logDouble# y of r -> r /## d#)
    where
        D# d# = lx
main :: IO ()
main = print . V.sum . V.map log10 $ V.enumFromN 1 (10^7)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With