Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

F# Performance Impact of Checked Calcs?

Is there a performance impact from using the Checked module? I've tested it out with sequences of type int and see no noticeable difference. Sometimes the checked version is faster and sometimes unchecked is faster, but generally not by much.

Seq.initInfinite (fun x-> x) |> Seq.item 1000000000;;
Real: 00:00:05.272, CPU: 00:00:05.272, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 1000000000
open Checked

Seq.initInfinite (fun x-> x) |> Seq.item 1000000000;;
Real: 00:00:04.785, CPU: 00:00:04.773, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 1000000000

Basically I'm trying to figure out if there would be any downside to always opening Checked. (I encountered an overflow that wasn't immediately obvious, so I'm now playing the role of the jilted lover who doesn't want another broken heart.) The only non-contrived reason I can come up with for not always using Checked is if there were some performance hit, but I haven't seen one yet.

like image 202
Talmage Avatar asked Oct 17 '22 14:10

Talmage


1 Answers

When you measure performance it's usually not a good idea to include Seq as Seq adds lots of overhead (at least compared to int operations) so you risk that most of the time is spent in Seq, not in the code you like to test.

I wrote a small test program for (+):

let clock = 
  let sw = System.Diagnostics.Stopwatch ()
  sw.Start ()
  fun () ->
    sw.ElapsedMilliseconds

let dbreak () = System.Diagnostics.Debugger.Break ()

let time a =
  let b = clock ()
  let r = a ()
  let n = clock ()
  let d = n - b
  d, r

module Unchecked =
  let run c () =
    let rec loop a i =
      if i < c then
        loop (a + 1) (i + 1)
      else
        a
    loop 0 0

module Checked =
  open Checked

  let run c () =
    let rec loop a i =
      if i < c then
        loop (a + 1) (i + 1)
      else
        a
    loop 0 0

[<EntryPoint>]
let main argv =
  let count     = 1000000000
  let testCases =
    [|
      "Unchecked" , Unchecked.run
      "Checked"   , Checked.run
    |]

  for nm, a in testCases do
    printfn "Running %s ..." nm
    let ms, r = time (a count)
    printfn "... it took %d ms, result is %A" ms r

  0

The performance results are this:

Running Unchecked ...
... it took 561 ms, result is 1000000000
Running Checked ...
... it took 1103 ms, result is 1000000000

So it seems some overhead is added by using Checked. The cost of int add should be less than the loop overhead so the overhead of Checked is higher than 2x maybe closer to 4x.

Out of curiousity we can check the IL Code using tools like ILSpy:

Unchecked:

    IL_0000: nop
    IL_0001: ldarg.2
    IL_0002: ldarg.0
    IL_0003: bge.s IL_0014

    IL_0005: ldarg.0
    IL_0006: ldarg.1
    IL_0007: ldc.i4.1
    IL_0008: add
    IL_0009: ldarg.2
    IL_000a: ldc.i4.1
    IL_000b: add
    IL_000c: starg.s i
    IL_000e: starg.s a
    IL_0010: starg.s c
    IL_0012: br.s IL_0000

Checked:

    IL_0000: nop
    IL_0001: ldarg.2
    IL_0002: ldarg.0
    IL_0003: bge.s IL_0014

    IL_0005: ldarg.0
    IL_0006: ldarg.1
    IL_0007: ldc.i4.1
    IL_0008: add.ovf
    IL_0009: ldarg.2
    IL_000a: ldc.i4.1
    IL_000b: add.ovf
    IL_000c: starg.s i
    IL_000e: starg.s a
    IL_0010: starg.s c
    IL_0012: br.s IL_0000

The only difference is that Unchecked uses add and Checked uses add.ovf. add.ovf is add with overflow check.

We can dig even deeper by looking at the jitted x86_64 code.

Unchecked:

; if i < c then
00007FF926A611B3  cmp         esi,ebx  
00007FF926A611B5  jge         00007FF926A611BD  
; i + 1
00007FF926A611B7  inc         esi  
; a + 1
00007FF926A611B9  inc         edi  
; loop (a + 1) (i + 1)
00007FF926A611BB  jmp         00007FF926A611B3

Checked:

; if i < c then
00007FF926A62613  cmp         esi,ebx  
00007FF926A62615  jge         00007FF926A62623  
; a + 1
00007FF926A62617  add         edi,1  
; Overflow?
00007FF926A6261A  jo          00007FF926A6262D  
; i + 1
00007FF926A6261C  add         esi,1  
; Overflow?
00007FF926A6261F  jo          00007FF926A6262D  
; loop (a + 1) (i + 1)
00007FF926A62621  jmp         00007FF926A62613

Now the reason for the Checked overhead is visible. After each operation the jitter inserts the conditional instruction jo which jumps to code that raises OverflowException if the overflow flag is set.

This chart shows us that the cost of an integer add is less than 1 clock cycle. The reason it's less than 1 clock cycle is that modern CPU can execute certain instructions in parallel.

The chart also shows us that branch that was correctly predicted by the CPU takes around 1-2 clock cycles.

So assuming a throughtput of at least 2 the cost of two integer additions in the Unchecked example should be 1 clock cycle.

In the Checked example we do add, jo, add, jo. Most likely CPU can't parallelize in this case and the cost of this should be around 4-6 clock cycles.

Another interesting difference is that the order of additions changed. With checked additions the order of the operations matter but with unchecked the jitter (and the CPU) has a greater flexibility moving the operations possibly improving performance.

So long story short; for cheap operations like (+) the overhead of Checked should be around 4x-6x compared to Unchecked.

This assumes no overflow exception. The cost of a .NET exception is probably around 100,000x times more expensive than an integer addition.

like image 63
Just another metaprogrammer Avatar answered Oct 21 '22 03:10

Just another metaprogrammer