How to keep very big elements on memory without exhausting the garbage collector?

In Haskell, I created a Vector of 1000000 IntMaps. I then used Gloss to render a picture in a way that accesses random IntMaps from that vector.
That is, I had to keep every single one of them in memory. The rendering function itself is very lightweight, so performance should have been good.
Yet the program was running at 4 FPS. Upon profiling, I noticed 95% of the time was spent on GC. Fair enough:
the GC keeps scanning my vector, even though it never changes.

Is there any way to tell GHC "this big value is needed and will not change - do not try to collect anything inside it"?

Edit: the program below is sufficient to replicate the issue.

import qualified Data.IntMap as Map
import qualified Data.Vector as Vec
import Graphics.Gloss
import Graphics.Gloss.Interface.IO.Animate
import System.Random

main = do
    let size  = 10000000
    let gen i = Map.fromList $ zip [mod i 10..0] [0..mod i 10]
    let vec   = Vec.fromList $ map gen [0..size]
    let draw t = do 
            rnd <- randomIO :: IO Int
            let empty = Map.null $ vec Vec.! mod rnd size
            let rad   = if empty then 10 else 50
            return $ translate (20 * cos t) (20 * sin t) (circle rad)
    animateIO (InWindow "hi" (256,256) (1,1)) white draw

This accesses a random map in a huge vector and draws a rotating circle whose radius depends on whether that map is empty.
Despite the logic being very simple, the program struggles at around 1 FPS here.

asked May 14 '15 by MaiaVictor


3 Answers

gloss is the culprit here.

First, a little background on GHC's garbage collector. GHC uses (by default) a generational, copying garbage collector. This means that the heap consists of several memory areas called generations. Objects are allocated into the youngest generation. When a generation becomes full, it is scanned for live objects and the live objects are copied into the next older generation, and then the generation that was scanned is marked as empty. When the oldest generation becomes full, the live objects are instead copied into a new version of the oldest generation.

An important fact to take away from this is that the GC only ever examines live objects. Dead objects are never touched at all. This is great when collecting generations that are mostly garbage, as often happens in the youngest generation. It's not good if long-lived data undergoes many GCs, as it will be copied repeatedly. (It can also be counterintuitive to those used to malloc/free-style memory management, where allocation and deallocation are both quite expensive, but leaving objects allocated for a long time has no direct cost.)

Now, the "generational hypothesis" is that most objects are either short-lived or long-lived. The long-lived objects will quickly end up in the oldest generation since they are alive at every collection. Meanwhile, most of the short-lived objects that are allocated will never survive the youngest generation; only those that happen to be alive when it is collected will be promoted to the next generation. Similarly, most of those short-lived objects that do get promoted will not survive to the third generation. As a result, the oldest generation that holds the long-lived objects should fill up very slowly, and its expensive collections that have to copy all the long-lived objects should occur rarely.

Now, all of this is actually true in your program, except for one problem:

    let displayFun backendRef = do
            -- extract the current time from the state
            timeS           <- animateSR `getsIORef` AN.stateAnimateTime

            -- call the user action to get the animation frame
            picture         <- frameOp (double2Float timeS)

            renderS         <- readIORef renderSR
            portS           <- viewStateViewPort <$> readIORef viewSR

            windowSize      <- getWindowDimensions backendRef

            -- render the frame
            displayPicture
                    windowSize
                    backColor
                    renderS
                    (viewPortScale portS)
                    (applyViewPortToPicture portS picture)

            -- perform GC every frame to try and avoid long pauses
            performGC

gloss tells the GC to collect the oldest generation every frame!

This might be a good idea if those collections are expected to take less time than the delay between frames anyway, but it's clearly not a good idea for your program. If you remove that performGC call from gloss, then your program runs quite quickly. Presumably if you let it run for long enough, then the oldest generation will eventually fill up and you might get a delay of a few tenths of a second as the GC copies all your long-lived data, but that's much better than paying that cost every frame.

All that said, there is GHC ticket #9052 about adding a stable generation, which would also suit your needs nicely. See there for more details.

answered Oct 13 '22 by Reid Barton


I would try compiling with -rtsopts (or baking the options in with -with-rtsopts) and then playing with the suggested heap size (-H) and/or allocation area size (-A) RTS options. Those greatly influence how the GC works.

More info here: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime-control.html
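
For example (the sizes below are only starting points to experiment with, not recommendations), build so that RTS options can be passed at run time, then try a larger allocation area and a suggested heap size; -s prints a GC summary so you can see the effect:

    ghc -O2 -rtsopts Main.hs
    ./Main +RTS -A64m -H1g -s -RTS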

answered Oct 13 '22 by jhickner


To add to Reid's answer, I've found performMinorGC (added in https://ghc.haskell.org/trac/ghc/ticket/8257) to be the best of both worlds here.

Without any explicit GC scheduling, I still get frequent collection-related frame drops as the nursery becomes exhausted. But performGC indeed becomes performance-killing as soon as there is any significant long-lived memory usage.

performMinorGC does what we want, ignoring long-term memory and cleaning up the garbage from each frame predictably - especially if you tune -H and -A to ensure that the per-frame garbage fits in the nursery.
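
As a sketch of what that looks like with the question's program (assuming a gloss build that no longer issues its own performGC every frame, as discussed in Reid's answer, and a base new enough to export System.Mem.performMinorGC, i.e. 4.7+):

    import qualified Data.IntMap as Map
    import qualified Data.Vector as Vec
    import Graphics.Gloss
    import Graphics.Gloss.Interface.IO.Animate
    import System.Mem (performMinorGC)
    import System.Random

    -- The question's program, with an explicit minor collection each frame.
    main :: IO ()
    main = do
        let size  = 10000000
        let gen i = Map.fromList $ zip [mod i 10..0] [0..mod i 10]
        let vec   = Vec.fromList $ map gen [0..size]
        let draw t = do
                rnd <- randomIO :: IO Int
                let empty = Map.null $ vec Vec.! mod rnd size
                let rad   = if empty then 10 else 50
                -- Collect only the young generation(s): this frame's garbage
                -- is cleaned up, but the old generation holding vec is left alone.
                performMinorGC
                return $ translate (20 * cos t) (20 * sin t) (circle rad)
        animateIO (InWindow "hi" (256,256) (1,1)) white draw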

answered Oct 13 '22 by Luke Iannini