
Why does performGC fail to release all memory?

Given the program:

import Language.Haskell.Exts.Annotated -- from haskell-src-exts
import System.Mem
import System.IO
import Control.Exception

main :: IO ()
main = do
  evaluate $ length $ show $ fromParseResult $ parseFileContents $ "data C = C {a :: F {- " ++ replicate 400000 'd' ++ " -}     }"
  performGC
  performGC
  performGC

Using GHC 7.0.3, when I run:

$ ghc --make Temp.hs -rtsopts && Temp.exe +RTS -G1 -S
    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
 ...
 29463264        64   8380480  0.00  0.00    0.64    0.85    0    0  (Gen:  0)
       20        56   8380472  0.00  0.00    0.64    0.86    0    0  (Gen:  0)
        0        56   8380472  0.00  0.00    0.64    0.87    0    0  (Gen:  0)
    42256       780     33452  0.00  0.00    0.64    0.88    0    0  (Gen:  0)
        0                      0.00  0.00

The performGC calls appear to leave 8 MB of memory live, even though all of that memory should be dead by then. How come?

(Without -G1 I see 10 MB live at the end, which I also can't explain.)

Neil Mitchell asked Jun 12 '11 18:06



2 Answers

Here's what I see (after inserting a print before the last performGC, to help tag when things happen):

   524288    524296  32381000  0.00  0.00    1.15    1.95    0    0  (Gen:  0)
   524288    524296  31856824  0.00  0.00    1.16    1.96    0    0  (Gen:  0)
   368248       808   1032992  0.00  0.02    1.16    1.99    0    0  (Gen:  1)
        0       808   1032992  0.00  0.00    1.16    1.99    0    0  (Gen:  1)
"performed!"
    39464      2200   1058952  0.00  0.00    1.16    1.99    0    0  (Gen:  1)
    22264      1560   1075992  0.00  0.00    1.16    2.00    0    0  (Gen:  0)
        0                      0.00  0.00

So after GCs there is still 1M on the heap (without -G1). With -G1 I see:

 34340656  20520040  20524800  0.10  0.12    0.76    0.85    0    0  (Gen:  0)
 41697072  24917800  24922560  0.12  0.14    0.91    1.01    0    0  (Gen:  0)
 70790776       800   2081568  0.00  0.02    1.04    1.20    0    0  (Gen:  0)
        0       800   2081568  0.00  0.00    1.04    1.20    0    0  (Gen:  0)
"performed!"
    39464      2184   1058952  0.00  0.00    1.05    1.21    0    0  (Gen:  0)
    22264      2856     43784  0.00  0.00    1.05    1.21    0    0  (Gen:  0)
        0                      0.00  0.00

So about 2M. This is on x86_64/Linux.

Let's think about the STG machine storage model to see if there's something else on the heap.

Things that could be in that 1M of space:

  • CAFs for things like [], string constants, and the small Int and Char pool, plus things in libraries, the stdin MVar?
  • Thread State Objects (TSOs) for the main thread.
  • Any allocated signal handlers.
  • The IO manager Haskell code.
  • Sparks in the spark pool.

From experience, this figure of slightly less than 1M seems to be the default "footprint" of a GHC binary. That's about what I've seen in other programs as well (e.g. shootout program smallest footprints are never less than 900K).
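To check that baseline on a given machine, here is a minimal sketch that asks the runtime how many bytes were live after a forced major collection. It assumes a newer GHC than the 7.0.3 used in the question (the GHC.Stats functions below come from later versions of base), and the helper toKiB is a hypothetical name used only for display. Statistics collection has to be switched on, for example by compiling with -rtsopts and running with +RTS -T.

```haskell
import Data.Word (Word64)
import GHC.Stats (gc, gcdetails_live_bytes, getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- Hypothetical display helper: bytes to whole KiB (rounds down).
toKiB :: Word64 -> Word64
toKiB b = b `div` 1024

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Run with +RTS -T to enable RTS statistics."
    else do
      -- Force a major collection so the figure reflects live data only.
      performGC
      stats <- getRTSStats
      -- Live bytes as recorded by the most recent GC.
      let live = gcdetails_live_bytes (gc stats)
      putStrLn ("Live after GC: " ++ show live ++ " bytes (~"
                ++ show (toKiB live) ++ " KiB)")
```

Running this in an otherwise empty program gives a number to compare against the footprint described above; the exact value will vary with GHC version and platform.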

Perhaps the profiler can say something. Here's the -hT profile (no profiling libs needed), after I insert a minimal busy loop at the end to string out the tail:

 $ ./A +RTS -K10M -S -hT -i0.001    

Results in this graph:


[Image: heap profile graph produced by the -hT run]


Victory! Look at that ~1M thread stack object sitting there!

I don't know of a way to make TSOs smaller.


The code that produced the above graph:

import Language.Haskell.Exts.Annotated -- from haskell-src-exts
import System.Mem
import System.IO
import Data.Int
import Control.Exception

main :: IO ()
main = do
  evaluate $ length $ show $ fromParseResult 
           $ parseFileContents 
           $ "data C = C {a :: F {- " ++ replicate 400000 'd' ++ " -}     }"
  performGC
  performGC
  print "performed!"
  performGC

  -- busy loop so we can sample what's left on the heap.
  let go :: Int32 -> IO ()
      go  0 = return ()
      go  n = go $! n-1
  go (maxBound :: Int32)
Don Stewart answered Oct 28 '22 13:10



Compiling the code with -O -ddump-simpl, I see the following global definition in the simplifier output:

lvl2_r12F :: [GHC.Types.Char]
[GblId]
lvl2_r12F =
  GHC.Base.unpackAppendCString# "data C = C {a :: F {- " lvl1_r12D

The input to the parser function has become a global string constant. Globals are never garbage collected in GHC, so that's probably what's occupying the 8MB of memory after garbage collection.
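One way to test this explanation, sketched under assumptions: build the input from a value that is only known at run time, so the optimizer has no constant string to float out to a top-level definition. The names mkInput and the command-line argument are hypothetical, and the haskell-src-exts parse is replaced by a plain length so the sketch stays self-contained. If the retained memory really is the floated constant, the live figure after performGC should drop when the program is run this way with +RTS -S; if it does not, the cause lies elsewhere.

```haskell
import Control.Exception (evaluate)
import System.Environment (getArgs)
import System.Mem (performGC)

-- Hypothetical helper: the same input shape as in the question, but the
-- comment length is a parameter instead of a compile-time constant.
mkInput :: Int -> String
mkInput n = "data C = C {a :: F {- " ++ replicate n 'd' ++ " -}     }"

main :: IO ()
main = do
  args <- getArgs
  case args of
    [s] -> do
      -- The string depends on a run-time value, so it cannot become a
      -- top-level constant; it is garbage once length has consumed it.
      _ <- evaluate (length (mkInput (read s)))
      performGC
      putStrLn "done"
    _ -> putStrLn "usage: pass the comment length, e.g. 400000"
```

An alternative to try is compiling the original program with -fno-full-laziness, which asks GHC not to float expressions outward; whether that changes the result here is not something this sketch confirms.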

Heatsink answered Oct 28 '22 12:10
