Runhaskell performance anomaly

Question

I'm trying to understand a performance anomaly being observed when running a program under runhaskell.

The program in question is:

isFactor n = (0 ==) . (mod n)
factors x = filter (isFactor x) [2..x]
main = putStrLn $ show $ sum $ factors 10000000

When I run this, it takes 1.18 seconds.

However, if I redefine isFactor as:

isFactor n f = (0 ==) (mod n f)

then the program takes 17.7 seconds.

That's a huge difference in performance and I would expect the programs to be equivalent. Does anybody know what I'm missing here?

Note: This doesn't happen when compiled under GHC.

John L · Accepted Answer

Although the functions should be identical, there's a difference in how they're applied. With the first definition, isFactor is fully applied at the call site isFactor x. In the second definition, it isn't, because now isFactor explicitly takes two arguments.

Even minimal optimizations are enough for GHC to see through this and create identical code for both, however if you compile with -O0 -ddump-simpl you can determine that, with no optimizations, this makes a difference (at least with ghc-7.2.1, YMMV with other versions).

With the first isFactor, GHC creates a single function that is passed as the predicate to "GHC.List.Filter", with the calls to mod 10000000 and (==) inlined. For the second definition, what happens instead is that most of the calls within isFactor are let-bound references to class functions, and are not shared between multiple invocations of isFactor. So there's a lot of dictionary overhead that's completely unnecessary.

This is almost never an issue because even the default compiler settings will optimize it away, however runhaskell apparently doesn't even do that much. Even so, I have occasionally structured code as someFun x y = \z -> because I know that someFun will be partially applied and this was the only way to preserve sharing between calls (i.e. GHC's optimizer wasn't sufficiently clever).

MathematicalOrchid · Answer

As I understand it, runhaskell does little to no optimisation. It's designed to quickly load and run code. If it did more optimisation, it would take longer for your code to start running. Of course, in this case, the code runs faster with optimisations.

As I understand it, if a compiled version of the code exists, then runhaskell will use it. So if performance matters to you, just make sure you compile with optimisations turned on first. (I think you might even be able to pass switches to runhaskell to turn on optimisations - you'd have to check the documentation...)

Runhaskell performance anomaly

Tags:

haskell

runhaskell

Steve

2 Answers

John L

MathematicalOrchid

Recent Activity

Donate For Us

Runhaskell performance anomaly

Tags:

haskell

runhaskell

Steve

2 Answers

John L

MathematicalOrchid

Related questions

Recent Activity

Donate For Us