Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is `logBase 10 x` slower than `log x / log 10`, even when specialized?

solrize in #haskell asked a question about one version of this code and I tried some other cases and was wondering what was going on. On my machine the "fast" code takes ~1 second and the "slow" code takes ~1.3-1.5 (everything is compiled with ghc -O2).

import Data.List

log10 :: Double -> Double
--log10 x = log x / log 10 -- fast
--log10 = logBase 10 -- slow
--log10 = barLogBase 10 -- fast
--log10 = bazLogBase 10 -- fast
log10 = fooLogBase 10 -- see below

class Foo a where
    fooLogBase :: a -> a -> a

instance Foo Double where
    --fooLogBase x y = log y / log x -- slow
    fooLogBase x = let lx = log x in \y -> log y / lx -- fast


barLogBase :: Double -> Double -> Double
barLogBase x y = log y / log x

bazLogBase :: Double -> Double -> Double
bazLogBase x = let lx = log x in \y -> log y / lx


main :: IO ()
main = print . foldl' (+) 0 . map log10 $ [1..1e7]

I'd've hoped that GHC would be able to turn logBase x y into exactly the same thing as log y / log x, when specialised. What's going on here, and what would be the recommended way of using logBase?

like image 697
shachaf Avatar asked Jul 02 '12 09:07

shachaf


1 Answers

As always, look at the Core.

Fast (1.563s) -

-- note: top level constant, referred to by specialized fooLogBase

Main.main_lx :: GHC.Types.Double
Main.main_lx =
     case GHC.Prim.logDouble# 10.0 of { r ->
     GHC.Types.D# r
  }

Main.main7 :: GHC.Types.Double -> GHC.Types.Double
Main.main7 =
  \ (y :: GHC.Types.Double) ->
    case y of _ { GHC.Types.D# y# ->
    case GHC.Prim.logDouble# y# of { r0 ->
    case Main.main_lx of { GHC.Types.D# r ->
    case GHC.Prim./## r0 r of { r1 ->
    GHC.Types.D# r1
    }
    }
    }

Slow (2.013s)

-- simpler, but recomputes log10 each time
Main.main7 =
  \ (y_ahD :: GHC.Types.Double) ->
    case y_ahD of _ { GHC.Types.D# x_aCD ->
    case GHC.Prim.logDouble# x_aCD of wild1_aCF { __DEFAULT ->
    case GHC.Prim.logDouble# 10.0 of wild2_XD9 { __DEFAULT ->
    case GHC.Prim./## wild1_aCF wild2_XD9 of wild3_aCz { __DEFAULT ->
    GHC.Types.D# wild3_aCz
    }
    }
    }
    }

In the fast version, log10 is computed once and shared (the static argument is applied once only). In the slow version it is recomputed each time.

You can follow this line of reasoning to produce even better versions:

-- 1.30s
lx :: Double
lx = log 10

log10 :: Double -> Double
log10 y = log y / lx

main :: IO ()
main = print . foldl' (+) 0 . map log10 $ [1..1e7]

And, using array fusion, you can remove the penalty of the compositional style:

import qualified Data.Vector.Unboxed as V

lx :: Double
lx = log 10

log10 :: Double -> Double
log10 y = log y / lx

main :: IO ()
main = print . V.sum . V.map log10 $ V.enumFromN 1 (10^7)

Cutting the cost by 3x

$ time ./A
6.5657059080059275e7

real    0m0.672s
user    0m0.000s
sys     0m0.000s

Which is as good as writing it by hand. The below offers no benefit over the correctly written version above.

lx :: Double
lx = D# (GHC.Prim.logDouble# 10.0##)

log10 :: Double -> Double
log10 (D# y) = D# (case logDouble# y of r -> r /## d#)
    where
        D# d# = lx

main :: IO ()
main = print . V.sum . V.map log10 $ V.enumFromN 1 (10^7)
like image 81
Don Stewart Avatar answered Oct 12 '22 12:10

Don Stewart