Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell profiling understanding cost centre summary for anonymous lambda

I'm trying to profile my medium size compiler written in Haskell. When running it on an example file, I get the following profiling data:

vehicle +RTS -po.profiling/vehicle-1706073721 -p -RTS compile --target MarabouQueries --specification vehicle/tests/golden/compile/acasXu/spec.vcl --network acasXu:vehicle/tests/golden/compile/acasXu/fake.onnx

    total time  =        0.72 secs   (716 ticks @ 1000 us, 1 processor)
    total alloc = 1,558,506,104 bytes  (excludes profiling overheads)

COST CENTRE             MODULE                                          SRC                                                                       %time %alloc

fmap                    Vehicle.Prelude.Logging.Backend                 src/Vehicle/Prelude/Logging/Backend.hs:49:13-19                            19.7   21.2
runVehicle.\            Vehicle                                         src/Vehicle.hs:(56,46)-(67,52)                                             12.2   19.3
fmap                    Vehicle.Prelude.Supply                          src/Vehicle/Prelude/Supply.hs:28:13-19                                      9.1    8.2
compileDecls            Vehicle.Backend.Queries                         src/Vehicle/Backend/Queries.hs:(87,1)-(101,47)                              5.7    7.2
$mRelevantExplicitArg.\ Vehicle.Syntax.AST.Arg                          src/Vehicle/Syntax/AST/Arg.hs:51:36-60                                      5.6    5.4
>>=                     Vehicle.Prelude.Logging.Backend                 src/Vehicle/Prelude/Logging/Backend.hs:49:35-39                             4.3    2.9
fmap                    Vehicle.Compile.Context.Free.Instance           src/Vehicle/Compile/Context/Free/Instance.hs:24:13-19                       4.3    4.2
>>=                     Vehicle.Prelude.Supply                          src/Vehicle/Prelude/Supply.hs:28:35-39                                      3.2    2.9
>>=                     Vehicle.Prelude.Logging.Instance                src/Vehicle/Prelude/Logging/Instance.hs:49:35-39                            2.8    4.4
pure                    Vehicle.Prelude.Supply                          src/Vehicle/Prelude/Supply.hs:28:22-32                                      2.2    0.7
return                  Vehicle.Prelude.Logging.Backend                 src/Vehicle/Prelude/Logging/Backend.hs:49:35-39                             2.2    3.6
return                  Vehicle.Compile.Context.Free.Instance           src/Vehicle/Compile/Context/Free/Instance.hs:24:35-39                       2.2    1.5

Looking at the second entry, the function runVehicle (which is directly called from my main function) is as follows:

runVehicle :: (MonadStdIO IO) => Options -> IO ()
runVehicle Options {..} = do
  withLogger globalOptions $ \logSettings -> do
    case modeOptions of
      Nothing ->
        fatalError
          "No mode provided. Please use one of 'typeCheck', 'compile', 'verify', 'check', 'export'"
      Just mode -> case mode of
        Check options -> typeCheck logSettings options
        Compile options -> compile logSettings options
        Verify options -> verify logSettings options
        Validate options -> validate logSettings options
        Export options -> export logSettings options

My understanding is that the cost-centre denotes the time spent executing the function, excluding the time spent executing the functions it calls. The entry in the cost centre summaries seems to suggest that 12.2% of the time is spent executing the anonymous lambda.

I don't understand how that can be the case, as all it's doing is a simple case split. Does anyone have a good explanation for what's going on?

like image 889
user2667523 Avatar asked Oct 11 '25 15:10

user2667523


2 Answers

My understanding is that the cost-centre denotes the time spent executing the function, excluding the time spent executing the functions it calls.

I don't think this is correct. I think a more correct version is that the time assigned to a cost center denotes time spent executing that cost center, excluding the time assigned to cost centers it calls. If there are no functions it calls that are themselves cost centers, then the parent gets assigned all the cost. Perhaps that is what is happening to you, i.e., perhaps typeCheck, compile, and friends are not cost centers.

like image 52
Daniel Wagner Avatar answered Oct 14 '25 10:10

Daniel Wagner


The most likely cause is that one of the functions doing the actual work has not been assigned a cost center. Be warned that if you are using -fprof-auto to assign cost centers (a common default when Cabal or Stack is managing the profiling), then functions marked INLINE are not assigned cost centers by default.

For example, if the following program:

foo :: Int
{-# INLINE foo #-}
foo = sum [1..10000000000]

bar :: Maybe Int -> IO ()
bar = \mx -> case mx of
  Nothing -> print foo
  Just x -> print (x + foo)

main = bar (Just 5)

is compiled and run under Stack with:

stack build --profile
stack exec --profile main -- +RTS -p

then the profile will attribute the cost of the calculation in the inlined function foo to the anonymous lambda:

    Wed Jan 24 13:55 2024 Time and Allocation Profiling Report  (Final)

       main +RTS -p -RTS

    total time  =        5.00 secs   (5004 ticks @ 1000 us, 1 processor)
    total alloc =      49,488 bytes  (excludes profiling overheads)

COST CENTRE MODULE           SRC                       %time %alloc

bar.\       Main             Profile.hs:(6,14)-(8,27)  100.0   21.9
MAIN        MAIN             <built-in>                  0.0    1.3
CAF         GHC.IO.Handle.FD <entire-module>             0.0   70.2
CAF         GHC.IO.Encoding  <entire-module>             0.0    4.9
CAF         GHC.Conc.Signal  <entire-module>             0.0    1.3

Without the INLINE directive, costs are attributed as expected:

    Wed Jan 24 13:58 2024 Time and Allocation Profiling Report  (Final)

       main +RTS -p -RTS

    total time  =        4.89 secs   (4895 ticks @ 1000 us, 1 processor)
    total alloc =      49,488 bytes  (excludes profiling overheads)

COST CENTRE MODULE           SRC                       %time %alloc

foo         Main             Profile.hs:2:1-26         100.0    0.0
CAF         GHC.IO.Encoding  <entire-module>             0.0    4.9
MAIN        MAIN             <built-in>                  0.0    1.3
CAF         GHC.IO.Handle.FD <entire-module>             0.0   70.2
CAF         GHC.Conc.Signal  <entire-module>             0.0    1.3
bar.\       Main             Profile.hs:(5,14)-(7,27)    0.0   21.8
like image 39
K. A. Buhr Avatar answered Oct 14 '25 11:10

K. A. Buhr



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!