Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell - Excessive memory usage despite foldl' and strict data structures

Tags:

I have a program that is using "excessive" memory given what the program should be doing. Snippet of the problematic code (per GHC heap profiling with +RTS -h) follows below.

import Data.IntMap.Strict (IntMap)
import qualified Data.IntMap.Strict

weeklyOwnerCache :: (RealFloat v) => [Int] -> [Int] -> Int -> v -> IntMap (v, Integer) -> IntMap (v, Integer)
weeklyOwnerCache rosterStates rosterIndices player projection cache = foldl' updateState cache rosterStates
  where
    updateState c s = Data.IntMap.Strict.insert s (optimalRosterAtState rosterIndices player projection c s) c

optimalRosterAtState :: (RealFloat v) => [Int] -> Int -> v -> IntMap (v, Integer) -> Int -> (v, Integer)
optimalRosterAtState rosterIndices player projection cache state = foldl' (\p r->max p (selectPlayer r)) (priorValue, priorSelection) validRosterIndices
  where
    (priorValue, priorSelection) = cache Data.IntMap.Strict.! state
    validRosterIndices = filter (isValidAssignment state) rosterIndices
    selectPlayer r = selection r player projection cache state

selection :: (RealFloat v) => Int -> Int -> v -> IntMap (v, Integer) -> Int -> (v, Integer)
selection rosterIndex player projection cache state = (newValue, newSelections)
  where
    rosterSpotValue = 10 ^ rosterIndex
    priorState = state - rosterSpotValue
    (priorValue, priorSelections) = cache Data.IntMap.Strict.! priorState
    newValue = priorValue + projection
    newSelections = priorSelections .|. bit player

isValidAssignment :: Int -> Int -> Bool
isValidAssignment rosterState rosterIndex = digitValue rosterState rosterIndex > 0

digitValue :: Int -> Int -> Int
digitValue n i = (n `quot` (10 ^ i)) `rem` 10

Despite using foldl' and Data.IntMap.Strict to attempt to control the memory footprint, it still requires many multiples beyond what I'd consider reasonable:

  • There are 8 owners and 16 weeks = 108 total caches (each an IntMap (v, Integer)).
  • Each cache has 128 Int keys, ranging in value from 0 to 12221.
  • Each value (Double, Integer), and the Integer has at most 15 bits set (in a range of 0 to ~200) at any time. I'd think this could fit into a few words ... maybe 64 bytes?
  • In total, that's 108 caches * 128 keys * (8 key size + ~72 value size) = ~1.1M bytes
  • But this block of code consumes ~220MB in small runs (i.e. 10 iterations of the loop that includes this code) and, despite it being fixed in size, consumes 2GB+ @ 100 iterations. No other aspect of my code base has the same linear memory consumption.

I have refactored out a list comprehension to use foldl' specifically in weeklyOwnerCache to no avail. Are there any other lines of code that seem to have problematic memory consumption properties?

Edited to add one other relevant piece of information while working on minimal reproducing example

I've made the edits suggested below (i.e. using bang patterns to force evaluation), to no avail. While I'm extracting the code into a minimal reproducing example, there is one other piece of information that may be relevant.

The weeklyCache IntMap objects sit in a HashMap o [IntMap (Selection v)] structure (keyed by owner, each element of the ~16-long list being one weeklyCache). Is it possible that using a lazy list causes this?

The other possibility is that the newGame tuple itself (which is my "game rules interface" for all other library modules) is causing this. Based on the feedback below, if the newGame tuple is only being evaluated to WHNF before being passed around my library in Main.hs (e.g. to 100-10000 simulations, etc.), it's possible that this is my issue and I should have a strict parameterized data Game ... type as an interface instead of a tuple.

Full library code below for the Rules portion of the executable:

{-# LANGUAGE DerivingStrategies#-}
{-# LANGUAGE BangPatterns#-}

module Games.Rules.Draft ( newGame ) where

import Data.Bits ( xor, (.&.), (.|.), complement, popCount, clearBit, setBit, bit)
import Data.Hashable ( Hashable ( hashWithSalt ) )
import Data.List ( foldl' )

import Data.Array (Array)
import qualified Data.Array
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict
import Data.IntMap.Strict (IntMap)
import qualified Data.IntMap.Strict

--import Debug.Trace ( trace )

data Selection v = Selection !v !Integer deriving stock (Eq, Ord, Show)

data State o v = State
  { pick :: Int
  , selected :: Integer --relies on ownerToIndex to shift bits.
  , remaining :: Integer
  , caches :: HashMap o [IntMap (Selection v)] -- indexed by owner, then rosterState
  } deriving stock (Eq, Ord, Show)

instance Hashable (State o v) where
  hashWithSalt s (State _ sel _ _) =
    s `hashWithSalt` sel


--newGame :: (Eq o, Hashable o, Eq r, Hashable r, Eq p, Hashable p, RealFloat v, Show o, Show v)
newGame :: (Eq o, Hashable o, Eq r, Hashable r, Eq p, Hashable p, RealFloat v)
  => [o] -> Int -> Bool -> [(n, t, [p], [v])] -> [(p, [r])] -> [(r, Int)]
  -> (State o v, [o], State o v-> Bool, State o v-> o, State o v-> [Int], State o v-> Int -> o -> v, State o v -> Int -> State o v)
newGame owners rounds isSnake players positionToRosterSpots startingRosterSpots = (iS, owners, iT, cO, m, r, ns)
  where
    -- helper
    ownerCount = length owners
    ownerFromIndex = Data.Array.listArray (0, ownerCount-1) owners
    ownerToIndex = Data.HashMap.Strict.fromList $ zip owners [0..ownerCount-1]
    playerCount = length players
    (_, _, _, proj) = head players
    weekCount = length proj
    playerProjections = Data.Array.listArray (0, playerCount-1) [projections | (_, _, _, projections) <- players]
    rosterSpotCount = length startingRosterSpots
    rosterSpotIndices = [rosterSpotCount-1, rosterSpotCount-2..0]
    rosterSpotToIndex = Data.HashMap.Strict.fromList $ zip (map fst startingRosterSpots) rosterSpotIndices
    rosterSpotIndicesFromPositions = positionsToRosterSpotIndices positionToRosterSpots rosterSpotToIndex
    playerRosterIndices = Data.Array.listArray (0, playerCount-1) [rosterSpotIndicesFromPositions pos | (_, _, pos, _) <- players]
    rosterStates = rosterInts startingRosterSpots
    fullProblem = head rosterStates
    -- API
    iS = initialState owners playerCount weekCount rosterStates
    iT state = pick state == min (ownerCount * rounds) playerCount
    cO = currentOwner isSnake ownerCount ownerFromIndex
    m = setBits . remaining
    r = reward cO playerProjections playerRosterIndices fullProblem
    ns = nextState cO ownerToIndex playerCount playerProjections playerRosterIndices rosterStates

-- API functions

initialState :: (Eq o, Hashable o, RealFloat v) => [o] -> Int -> Int -> [Int] -> State o v
initialState owners playerCount weekCount rosterStates = State{pick=0, selected=0, remaining=initialRemaining, caches=initialCache}
  where
    initialRemaining = 2 ^ playerCount - 1
    initialCache = initialRewardCaches owners weekCount rosterStates

currentOwner :: Bool -> Int -> Array Int o -> State o v -> o
currentOwner isSnake ownerCount ownerFromIndex state = ownerFromIndex Data.Array.! ownerIndex
  where
    roundPick = pick state `rem` ownerCount
    isForward = even $ pick state `quot` ownerCount
    ownerIndex = if not isSnake || isForward then roundPick else ownerCount - roundPick - 1

--reward :: (Eq o, Hashable o, RealFloat v, Show o, Show v) => (State o v -> o) -> Array Int [v] -> Array Int [Int] -> Int -> State o v -> Int -> o -> v
--reward currentOwnerFromState playerProjections playerRosterIndices fullProblem state player owner | trace ("reward " ++ show state ++ " " ++ show player ++ " " ++ show owner) False = undefined
reward :: (Eq o, Hashable o, RealFloat v) => (State o v -> o) -> Array Int [v] -> Array Int [Int] -> Int -> State o v -> Int -> o -> v
reward currentOwnerFromState playerProjections playerRosterIndices fullProblem state player owner
    | owner /= currentOwnerFromState state = 0
    | otherwise = sum [weeklyReward fullProblem rosterIndices player p c | (p, c) <- weeks]
        where
          rosterIndices = playerRosterIndices Data.Array.! player
          projections = playerProjections Data.Array.! player
          weeks = zip projections (caches state Data.HashMap.Strict.! owner)

nextState :: (Eq o, Hashable o, RealFloat v) => (State o v -> o) -> HashMap o Int -> Int -> Array Int [v] -> Array Int [Int] -> [Int] -> State o v -> Int -> State o v
nextState currentOwnerFromState ownerToIndex playerCount playerProjections playerRosterIndices rosterStates state player = State nextPick nextSelected nextRemaining nextCaches
  where
    nextPick = pick state + 1
    owner = currentOwnerFromState state
    currentOwnerIndex = ownerToIndex Data.HashMap.Strict.! owner
    playerBit = currentOwnerIndex * playerCount + player
    nextSelected = setBit (selected state) playerBit
    nextRemaining = clearBit (remaining state) player -- no shift
    nextCaches = Data.HashMap.Strict.insert owner newOwnerCache priorCaches
      where
        priorCaches = caches state
        rosterIndices = playerRosterIndices Data.Array.! player
        projections = playerProjections Data.Array.! player
        weeks = zip projections (caches state Data.HashMap.Strict.! owner)
        newOwnerCache = [weeklyOwnerCache rosterStates rosterIndices player p c | (p, c) <- weeks]

-- helper functions

initialRewardCaches :: (Eq o, Hashable o, RealFloat v) => [o] -> Int -> [Int] -> HashMap o [IntMap (Selection v)]
initialRewardCaches owners weekCount rosterStates = Data.HashMap.Strict.fromList [(o, initialCache) | o<-owners]
  where
    initialCache = initialOwnerCache weekCount rosterStates

initialOwnerCache :: (RealFloat v) => Int -> [Int] -> [IntMap (Selection v)]
initialOwnerCache weekCount rosterStates = replicate weekCount initialCache
  where
    initialCache = initialWeeklyRewardCache rosterStates

initialWeeklyRewardCache :: (RealFloat v) => [Int] -> IntMap (Selection v)
initialWeeklyRewardCache = foldl' (\t s->Data.IntMap.Strict.insert s (Selection 0.0 0) t) Data.IntMap.Strict.empty

rosterInts :: [(r, Int)] -> [Int]
rosterInts startingRosterSpots = map digitsToInt digits
  where
    digits = sequence [[n, n-1..0] | n <- map snd startingRosterSpots]

digitsToInt :: [Int] -> Int
digitsToInt d = sum $ zipWith (\e p->e*10^p) d [n, n-1..0]
  where
    n = length d - 1

setBits :: Integer -> [Int]
setBits 0 = []
setBits n = popCount (b - 1) : setBits (n `xor` b)
  where
    b = n .&. (complement n + 1)

positionsToRosterSpotIndices :: (Eq p, Hashable p, Eq r, Hashable r) => [(p, [r])] -> HashMap r Int -> [p] -> [Int]
positionsToRosterSpotIndices positionToRosterSpots rosterSpotToIndex positions = setBits bitSet
  where
    pToRSMap = Data.HashMap.Strict.fromList positionToRosterSpots
    bitSet = foldl' (\b p->bitSetFromPosition pToRSMap rosterSpotToIndex p .|. b) 0 positions

bitSetFromPosition :: (Eq p, Hashable p, Eq r, Hashable r) => HashMap p [r] -> HashMap r Int -> p -> Integer
bitSetFromPosition positionToRosterSpot rosterSpotToIndex position = bitSet
  where
    rosterSpots = positionToRosterSpot Data.HashMap.Strict.! position
    bitSet = foldl' (\b r->bitSetFromRosterSpot rosterSpotToIndex r .|. b) 0 rosterSpots

bitSetFromRosterSpot :: (Eq r, Hashable r) => HashMap r Int -> r -> Integer
bitSetFromRosterSpot rosterSpotToIndex rosterSpot = bit (rosterSpotToIndex Data.HashMap.Strict.! rosterSpot)

--weeklyReward :: (RealFloat v, Show v) => Int -> [Int] -> Int -> v -> IntMap (Selection v) -> v
--weeklyReward fullProblem rosterIndices player projection cache | trace ("weeklyReward " ++ show fullProblem ++ " " ++ show rosterIndices ++ " " ++ show player ++ " " ++ show projection) False = undefined
weeklyReward :: (RealFloat v) => Int -> [Int] -> Int -> v -> IntMap (Selection v) -> v
weeklyReward fullProblem rosterIndices player projection cache = optimalValue - priorValue
  where
    (Selection priorValue _) = cache Data.IntMap.Strict.! fullProblem
    (Selection optimalValue _) = optimalRosterAtState rosterIndices player projection cache fullProblem

--weeklyOwnerCache :: (RealFloat v) => [Int] -> [Int] -> Int -> v -> IntMap (Selection v) -> IntMap (Selection v)
--weeklyOwnerCache rosterStates rosterIndices player projection cache = Data.IntMap.Strict.fromList optimalRosters
--  where
--    optimalRosters = [(state, optimalRosterAtState rosterIndices player projection cache state) | state <- rosterStates]

weeklyOwnerCache :: (RealFloat v) => [Int] -> [Int] -> Int -> v -> IntMap (Selection v) -> IntMap (Selection v)
weeklyOwnerCache rosterStates rosterIndices player projection cache = foldl' updateState cache rosterStates
  where
    updateState c s = Data.IntMap.Strict.insert s (optimalRosterAtState rosterIndices player projection c s) c

optimalRosterAtState :: (RealFloat v) => [Int] -> Int -> v -> IntMap (Selection v) -> Int -> Selection v
optimalRosterAtState rosterIndices player projection cache state = foldl' (\p r->max p (selectPlayer r)) prior validRosterIndices
  where
    prior = cache Data.IntMap.Strict.! state
    validRosterIndices = filter (isValidAssignment state) rosterIndices
    selectPlayer r = selection r player projection cache state

selection :: (RealFloat v) => Int -> Int -> v -> IntMap (Selection v) -> Int -> Selection v
selection rosterIndex player projection cache state = Selection newValue newSelections
  where
    rosterSpotValue = 10 ^ rosterIndex
    priorState = state - rosterSpotValue
    (Selection priorValue priorSelections) = cache Data.IntMap.Strict.! priorState
    newValue = priorValue + projection
    newSelections = priorSelections .|. bit player

isValidAssignment :: Int -> Int -> Bool
isValidAssignment rosterState rosterIndex = digitValue rosterState rosterIndex > 0

digitValue :: Int -> Int -> Int
digitValue n i = (n `quot` (10 ^ i)) `rem` 10

Edited to add full source repository

I'm having trouble extracting a minimal example that replicates the behavior, so I'm just posting the full repository.

BitBucket repo here. Once built, when in gameshs, stack exec draftsim-mcoffq-exe app/draftsim.json 13 3 replicates the behavior.

like image 495
MikeRand Avatar asked Jun 22 '21 13:06

MikeRand


1 Answers

Without a full working example, it's hard to pin anything down for sure.

The most likely problem is that, as far as I can see, your strict operations are only going to force tuples (v, Integer) to weak head normal form, but forcing a tuple only forces the tuple constructor, while the v and Integer themselves remain unevaluated.

So, as a very first pass, try doing some forcing of values in selection:

selection :: (RealFloat v) => Int -> Int -> v -> IntMap (v, Integer) -> Int -> (v, Integer)
selection rosterIndex player projection cache state = (newValue, newSelections)
  where
    rosterSpotValue = 10 ^ rosterIndex
    priorState = state - rosterSpotValue
    (priorValue, priorSelections) = cache Data.IntMap.Strict.! priorState
    !newValue = priorValue + projection
    !newSelections = priorSelections .|. bit player

(Note the bang patterns in the last two lines.)

If that doesn't work, as a second pass pass, try replacing your tuple type (v, Integer) with a strict version:

data Selection v = Selection !v !Integer

and see if that fixes the problem.

like image 123
K. A. Buhr Avatar answered Sep 30 '22 15:09

K. A. Buhr