[For Clarify my question]: I want to know why shorthanded code let performance down when I using the State Monad. This question is not for optimizing my code. This code is for benchmarking of the stateful monads. You can see the
State
-free pure version andST
monad version code and benchmark result from the master branch of my repo.
When I try to shorthand in my code, I've got an unexpected result by the shorthanded function. (Please refer focused benchmark and overall benchmark
In LazinessTest branch of this repository,
I tried to shorthand this code,
runTimeSlot' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot' (target : idx : rest) inst operand = do
d <- get
case inst of
-- this code ↓↓↓↓
0 -> case (rem target sizeOfTarget) of
0 -> state $ \s -> ((idx : rest), setTime operand d)
...
1 -> case (rem target sizeOfTarget) of
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
...
as like as following
runTimeSlot'' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'' (target : idx : rest) inst operand = do
d <- get
case inst of
-- as like as ↓↓↓↓
0 -> case targetInData of
0 -> state $ \s -> ((idx : rest), setTime operand d)
...
1 -> case targetInData of
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
...
where targetInData = rem target sizeOfTarget
And it shows that drastically performance down from 622 μs to 1.767 ms.
Even the value targetInData
would be evaluated in just next step case
,
I thought that I may figure it by making the targetInData
strictly like
runTimeSlot''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''' (target : idx : rest) inst operand = do
d <- get
-- evaluate it ↓↓ here before it used
targetInData `seq` case inst of
0 -> case targetInData of
0 -> state $ \s -> ((idx : rest), setTime operand d)
...
1 -> case targetInData of
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
...
where targetInData = rem target sizeOfTarget
But this also does not work. (Takes 1.758 ms)
Based on @AndrásKovács 's comment (Thanks, @AndrásKovács)
I've added BangPatterns
to targetInData
as like as
runTimeSlot''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''b (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
...
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
...
where !targetInData = rem target sizeOfTarget
runTimeSlot'''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''b (target : idx : rest) inst operand = do
d <- get
-- evaluate it ↓↓ here before it used
targetInData `seq` case inst of
0 -> case targetInData of
0 -> state $ \s -> ((idx : rest), setTime operand d)
...
1 -> case targetInData of
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
...
where !targetInData = rem target sizeOfTarget
And it helps a little, but it does not solves the unexpected situation completely.
runTimeSlot''
-> 1.527 ms @ runTimeSlot''b
runTimeSlot'''
-> 1.503 ms @ runTimeSlot'''b
Why not as like as 622 μs @ runTimeSlot
???
I couldn't explain this situation with laziness by myself.
Could you explain why just replacing (rem target sizeOfTaregt)
as shorthand code makes the performance worse?
Here is single compilable example code and benchmark result: (I'm sorry that I couldn't reduce unnecessary codes enough)
-- dependencies: base, containers, criterion, deepseq, mtl, splitmix
{-# LANGUAGE BangPatterns #-}
module Main where
import Criterion.Main
import Criterion.Types
import Control.DeepSeq
import Control.Monad.State.Strict
import Data.Bifunctor
import Data.Maybe
import qualified Data.IntMap as IM
import Data.List
import System.Random.SplitMix
myConfig60s =
defaultConfig { timeLimit = 60.0, resamples = 10000, verbosity = Verbose }
randomInts :: SMGen -> [Int]
randomInts = unfoldr (Just . (first fromIntegral . bitmaskWithRejection64 64))
main :: IO ()
main = do
putStrLn "Initialize"
let size = 10000
let instSize = 2
let targetSize = 16
let operandSize = 256
let i0Gen = (mkSMGen . fromIntegral) 0
let (targetGen, i1Gen) = splitSMGen i0Gen
let (instGen, i2Gen) = splitSMGen i1Gen
let (operGen, iGen) = splitSMGen i2Gen
let infTargetList = map (\x -> rem x targetSize) $ randomInts targetGen
let infInstList = map (\x -> rem x instSize) $ randomInts instGen
let infOperandList = map (\x -> rem x operandSize + 1) $ randomInts operGen
let (iTime : iBalance : iStatus : _) = randomInts iGen
let targetList = take (size * 2) infTargetList
let instList = take size infInstList
let operandList = take size infOperandList
targetList `deepseq` instList `deepseq` operandList `deepseq` putStrLn
"Evaluated"
let iData = Data iTime iBalance iStatus IM.empty
let
ssBench =
bgroup "SingleState Simulation"
$ [ bench "SingleState.StrictPure'" $ nf
( runSimulatorPure' size targetList instList operandList
)
iData
, bench "SingleState.StrictPure''" $ nf
( runSimulatorPure'' size targetList instList operandList
)
iData
, bench "SingleState.StrictState'" $ nf
( runState
$ runSimulator' size targetList instList operandList
)
iData
, bench "SingleState.StrictState''" $ nf
( runState
$ runSimulator'' size targetList instList operandList
)
iData
, bench "SingleState.StrictState''b" $ nf
( runState
$ runSimulator''b size targetList instList operandList
)
iData
, bench "SingleState.StrictState'''" $ nf
( runState
$ runSimulator''' size targetList instList operandList
)
iData
, bench "SingleState.StrictState'''b" $ nf
( runState
$ runSimulator'''b size targetList instList operandList
)
iData
, bench "SingleState.StrictState''''" $ nf
( runState
$ runSimulator'''' size targetList instList operandList
)
iData
, bench "SingleState.StrictState'''''" $ nf
( runState
$ runSimulator''''' size targetList instList operandList
)
iData
]
putStrLn "Do bench"
defaultMainWith myConfig60s [ssBench]
-- from SingleState.StrictPure of the repo
runSimulatorPure' :: Int -> [Int] -> [Int] -> [Int] -> Data -> Data
runSimulatorPure' 0 _ _ _ d = d
runSimulatorPure' size tList (i : iList) (o : oList) d =
restTList
`seq` newData
`seq` runSimulatorPure' (size - 1) restTList iList oList newData
where (restTList, newData) = runTimeSlotPure' tList i o d
runTimeSlotPure' :: [Int] -> Int -> Int -> Data -> ([Int], Data)
runTimeSlotPure' (target : idx : rest) inst operand d = case inst of
0 -> case (rem target sizeOfTarget) of -- Set
0 -> ((idx : rest), setTime operand d)
1 -> ((idx : rest), setBalance operand d)
2 -> ((idx : rest), setStatus operand d)
3 -> (rest, setEntry idx operand d)
1 -> case (rem target sizeOfTarget) of -- Mod
0 -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> (rest, modifyEntry (\x -> rem x operand) idx d)
runSimulatorPure'' :: Int -> [Int] -> [Int] -> [Int] -> Data -> Data
runSimulatorPure'' 0 _ _ _ d = d
runSimulatorPure'' size tList (i : iList) (o : oList) d =
restTList
`seq` newData
`seq` runSimulatorPure'' (size - 1) restTList iList oList newData
where (restTList, newData) = runTimeSlotPure'' tList i o d
runTimeSlotPure'' :: [Int] -> Int -> Int -> Data -> ([Int], Data)
runTimeSlotPure'' (target : idx : rest) inst operand d = case inst of
0 -> case targetInData of -- Set
0 -> ((idx : rest), setTime operand d)
1 -> ((idx : rest), setBalance operand d)
2 -> ((idx : rest), setStatus operand d)
3 -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> (rest, modifyEntry (\x -> rem x operand) idx d)
where targetInData = rem target sizeOfTarget
-- from SingleState.StrictState of the repo
runSimulator :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator 0 _ _ _ = get
runSimulator size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot tList i o
runSimulator (size - 1) restTList iList oList
runTimeSlot :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case (rem target sizeOfTarget) of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case (rem target sizeOfTarget) of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime rF d)
1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
3 -> state $ \s -> (rest, modifyEntry rF idx d)
-- 2 -> Add
-- 3 -> Div
where rF x = rem x operand
runSimulator' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator' 0 _ _ _ = get
runSimulator' size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot' tList i o
runSimulator' (size - 1) restTList iList oList
runTimeSlot' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot' (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case (rem target sizeOfTarget) of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case (rem target sizeOfTarget) of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
-- 2 -> Add
-- 3 -> Div
runSimulator'' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'' 0 _ _ _ = get
runSimulator'' size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot'' tList i o
runSimulator'' (size - 1) restTList iList oList
runTimeSlot'' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'' (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
-- 2 -> Add
-- 3 -> Div
where targetInData = rem target sizeOfTarget
runSimulator''b :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''b 0 _ _ _ = get
runSimulator''b size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot''b tList i o
runSimulator''b (size - 1) restTList iList oList
runTimeSlot''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''b (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
-- 2 -> Add
-- 3 -> Div
where !targetInData = rem target sizeOfTarget
runSimulator''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''' 0 _ _ _ = get
runSimulator''' size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot''' tList i o
runSimulator''' (size - 1) restTList iList oList
runTimeSlot''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''' (target : idx : rest) inst operand = do
d <- get
targetInData `seq` case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
-- 2 -> Add
-- 3 -> Div
where targetInData = rem target sizeOfTarget
runSimulator'''b :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'''b 0 _ _ _ = get
runSimulator'''b size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot'''b tList i o
runSimulator'''b (size - 1) restTList iList oList
runTimeSlot'''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''b (target : idx : rest) inst operand = do
d <- get
targetInData `seq` case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
-- 2 -> Add
-- 3 -> Div
where !targetInData = rem target sizeOfTarget
runSimulator'''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'''' 0 _ _ _ = get
runSimulator'''' size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot'''' tList i o
runSimulator'''' (size - 1) restTList iList oList
runTimeSlot'''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''' (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime rF d)
1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
3 -> state $ \s -> (rest, modifyEntry rF idx d)
-- 2 -> Add
-- 3 -> Div
where
targetInData = rem target sizeOfTarget
rF x = rem x operand
runSimulator''''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''''' 0 _ _ _ = get
runSimulator''''' size tList (i : iList) (o : oList) = do
restTList <- runTimeSlot''''' tList i o
runSimulator''''' (size - 1) restTList iList oList
runTimeSlot''''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''''' (target : idx : rest) inst operand = do
d <- get
targetInData `seq` case inst of
0 -> case targetInData of -- Set
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
1 -> case targetInData of -- Mod
0 -> state $ \s -> ((idx : rest), modifyTime rF d)
1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
3 -> state $ \s -> (rest, modifyEntry rF idx d)
-- 2 -> Add
-- 3 -> Div
where
targetInData = rem target sizeOfTarget
rF x = rem x operand
type Balance = Int
type Time = Int
type Status = Int
type Idx = Int
type Datum = Int
data Data = Data
{ time :: Time
, balance :: Balance
, status :: Status
, aMap :: IM.IntMap Datum
} deriving (Show,Eq)
sizeOfTarget :: Int
sizeOfTarget = 4
instance NFData Data where
rnf (Data t b s m) = rnf t `seq` rnf b `seq` rnf s `seq` rnf m
getTime = time
getBalance = balance
getStatus = status
getEntry idx = fromMaybe 0 . IM.lookup idx . aMap
setTime newTime d = d { time = newTime }
setBalance newBalance d = d { balance = newBalance }
setStatus newStatus d = d { status = newStatus }
setEntry idx aDatum d = d { aMap = IM.insert idx aDatum (aMap d) }
modifyTime f d = d { time = f (time d) }
modifyBalance f d = d { balance = f (balance d) }
modifyStatus f d = d { status = f (status d) }
modifyEntry f idx d = d { aMap = IM.adjust f idx (aMap d) }
Updated
State
-free version and ST
monad version from the master branch
P.S.
-threaded
and with -O2
.As far as I see from looking at Core output via ghc -O2 -ddump-simpl -dsuppress-all
, GHC simply misses State
tuple unboxing and worker-wrapping if we do the following:
runTimeSlot2 :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot2 (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> ...
1 -> ..
where targetInData = rem target sizeOfTarget
But it works in the following case. We can also put targetInData
in a let
before the case
.
runTimeSlot2 :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot2 (target : idx : rest) inst operand = do
d <- get
case inst of
0 -> ...
1 -> ..
where targetInData = rem target sizeOfTarget
What's the reason? I have no idea. But this is an example where we are placing a bit too much trust in GHC anyway, and the program is far from optimal to begin with. First, I'd make Data
strict, and use whnf
in the benchmark instead of nf
:
data Data = Data
{ time :: !Time
, balance :: !Balance
, status :: !Status
, aMap :: !(IM.IntMap Datum)
} deriving (Show,Eq)
Second, I don't think the State
buys us much in this particular example, and we can just write a tail-recursive function:
runSimulator1 :: Int -> [Int] -> [Int] -> [Int] -> Data -> Data
runSimulator1 = go where
go 0 _ _ _ d = d
go size (target : (idx : rest)) (i : iList) (o : oList) d =
let targetInData = rem target sizeOfTarget in
case i of
0 -> case targetInData of
0 -> go (size - 1) (idx : rest) iList oList (setTime o d)
1 -> go (size - 1) (idx : rest) iList oList (setBalance o d)
2 -> go (size - 1) (idx : rest) iList oList (setStatus o d)
3 -> go (size - 1) rest iList oList (setEntry idx o d)
1 -> case targetInData of
0 -> go (size - 1) (idx : rest) iList oList (modifyTime (\x -> rem x o) d)
1 -> go (size - 1) (idx : rest) iList oList (modifyBalance (\x -> rem x o) d)
2 -> go (size - 1) (idx : rest) iList oList (modifyStatus (\x -> rem x o) d)
3 -> go (size - 1) rest iList oList (modifyEntry (\x -> rem x o) idx d)
This runs a bit more than twice as fast on my computer than the better performing variant in your original benchmark.
I note a performance issue in the original code:
...
0 -> case targetInData of
0 -> state $ \s -> ((idx : rest), setTime operand d)
1 -> state $ \s -> ((idx : rest), setBalance operand d)
2 -> state $ \s -> ((idx : rest), setStatus operand d)
3 -> state $ \s -> (rest, setEntry idx operand d)
...
Above, all the returned state like setTime operand d
is lazy. Hence, we get a large number of thunks. We can do instead:
0 -> case targetInData of -- Set
0 -> (idx : rest) <$ (put $! setTime operand d)
1 -> (idx : rest) <$ (put $! setBalance operand d)
2 -> (idx : rest) <$ (put $! setStatus operand d)
3 -> rest <$ (put $! setEntry idx operand d)
This gives us a performance boost, but it still a bit slower than my State
-free version, because GHC can unbox Data
as a plain function argument or result, but it cannot unbox Data
inside the State
tuple.
In general, if you really want to optimize, the most robust solution is pure (non-monadic) and strict plain functions, preferably tail-recursive. It depends on the development situation whether it's worth the effort to optimize to such degree.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With