This code leaks or doesn't depending on how I compile the same RWS implementation:
import Control.Monad (replicateM_)
import qualified Control.Monad.RWS.CPS as RWS
import Data.Monoid (Sum (..))
-- import qualified RWS
{-# ANN module "HLint: ignore Use camelCase" #-}
count_a_lot :: Int -> RWS.RWS () (Sum Int) () ()
count_a_lot = flip replicateM_ count
count :: RWS.RWS () (Sum Int) () ()
count = RWS.tell . Sum $ 1
main :: IO ()
main = print . snd $ RWS.evalRWS (count_a_lot 10000000) () ()
I experimented with two ways of using the CPS version of RWS:
import qualified Control.Monad.RWS.CPS as RWS uses the writer-cps-transformers package, from nixpkgs.
import qualified RWS uses Internal.hs, copied verbatim as RWS.hs to the same local directory as the file pasted above.
If I compile the code above and run it, I can see that it runs in constant space.
If I change the imports to compile and link the copy of RWS.hs:
import Control.Monad (replicateM_)
-- import qualified Control.Monad.RWS.CPS as RWS
import Data.Monoid (Sum (..))
import qualified RWS
I get a space leak. In the heap profile, the black retainer is (103)<*>.\ and the blue is SYSTEM.
Why can this be? Or how should I best debug this?
In both cases, I am compiling with ghc (no cabal) with -Wall -O2 -prof -fprof-auto -rtsopts -fexternal-interpreter. I can post the more detailed compiler invocation and output lines if that is relevant.
Using ghc-8.4.4, transformers-0.5.5.0 and writer-cps-transformers-0.1.1.4. I know these versions are not current, but I am interested in knowing what is going on rather than in solving the actual leak, so I assume the versions aren't that relevant.
The culprit is, in fact, the profiling system. Note that you don't actually need -prof to detect the space leak: the RTS option -s prints a "total memory in use" figure without it. Less scientifically, the space leak makes the program a lot slower, and you can just feel it. Armed with this, I found that disabling -prof reduced the memory usage of the "local" version to 2MB (the same as the "library" version), while leaving it on used ~1.9GB.
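If you would rather read the numbers from inside the program than from the -s summary, base's GHC.Stats module exposes similar counters. The following is only a minimal sketch: the workload is a stand-in that counts in an IORef (an assumption, not the RWS code from the question), and toMiB and workload are names made up for illustration. It needs -rtsopts at compile time and +RTS -T at run time so that the RTS collects the statistics.

```haskell
module Main (main) where

import Control.Monad (replicateM_)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Convert a byte count to whole mebibytes (rounding down).
toMiB :: Word64 -> Word64
toMiB bytes = bytes `div` (1024 * 1024)

-- Stand-in workload (an assumption, not the RWS code from the question):
-- strictly count up to n in an IORef.
workload :: Int -> IO Int
workload n = do
  ref <- newIORef 0
  replicateM_ n (modifyIORef' ref (+ 1))
  readIORef ref

main :: IO ()
main = do
  total <- workload 10000000
  print total
  -- getRTSStats throws if statistics collection is off, so check first.
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      stats <- getRTSStats
      putStrLn ("peak memory in use (MiB): " ++ show (toMiB (max_mem_in_use_bytes stats)))
      putStrLn ("peak live data (MiB): " ++ show (toMiB (max_live_bytes stats)))
    else putStrLn "RTS stats are off; run with +RTS -T to enable them"
```

Swapping the stand-in workload for the real one and comparing a build with -prof against one without should show the same gap that -s reports.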
The reason profiling can make things slower is that GHC cannot optimize as well. It can't aggressively restructure the code you write anymore, because the cost centers imply a certain structure to the code, and sometimes there's no good place for a cost center after an optimization, which blocks that optimization.
Knowing precisely what's going wrong here would require knowing the flags with which you built the library, but the high-level explanation is that the writer-cps-transformers library was built with less aggressive profiling (and therefore more aggressive optimization) than the RWS.hs file. -fprof-auto is a very aggressive profiling option and it can easily break lots of optimizations. If I use a writer-cps-transformers built with -fprof-auto, I get the same issue. If I use an RWS.hs without profiling or with something weaker, like -fprof-auto-exported, the issue goes away.
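To make the "structure" point concrete, here is a cut-down CPS-style writer written from scratch for illustration. It is not the library's code, and Writer, tell and execWriter are hypothetical definitions of my own. It only shows the shape of code involved: the output is threaded through as a function argument and forced in tell, and the lambda inside <*> is presumably the kind of function that a cost-center name like <*>.\ refers to once -fprof-auto annotates every binding.

```haskell
module Main (main) where

import Control.Monad (replicateM_)
import Data.Monoid (Sum (..))

-- Hypothetical, simplified CPS writer: the accumulated output is passed
-- in as an argument rather than returned and combined afterwards.
newtype Writer w a = Writer { unWriter :: w -> (a, w) }

instance Functor (Writer w) where
  fmap f (Writer m) = Writer $ \w -> case m w of
    (a, w') -> (f a, w')

instance Applicative (Writer w) where
  pure a = Writer $ \w -> (a, w)
  -- With -fprof-auto this lambda would get a cost center of its own.
  Writer mf <*> Writer mx = Writer $ \w -> case mf w of
    (f, w') -> case mx w' of
      (x, w'') -> (f x, w'')

instance Monad (Writer w) where
  Writer m >>= k = Writer $ \w -> case m w of
    (a, w') -> unWriter (k a) w'

-- Append to the output, forcing the new accumulator before returning it.
tell :: Monoid w => w -> Writer w ()
tell x = Writer $ \w -> let w' = w <> x in w' `seq` ((), w')

-- Run a writer from an empty output and keep only the output.
execWriter :: Monoid w => Writer w a -> w
execWriter m = snd (unWriter m mempty)

main :: IO ()
main = print . getSum . execWriter $ replicateM_ 1000000 (tell (Sum (1 :: Int)))
-- prints 1000000
```

Whether a loop like this ends up running in constant space depends on how far GHC can inline and simplify these small functions, which is exactly what per-binding cost centers can get in the way of.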