Disclaimer: this is a micro-benchmark, so please don't reply with quotes such as "premature optimization is the root of all evil" if the topic bothers you.
The examples are built in Release mode targeting x64 (.NET 4.5, Visual Studio 2012, F# 3.0) and run on Windows 7 x64.
After profiling, I narrowed the bottleneck of one of my applications down to this, so I want to raise the question:

If there is no inner loop inside the for in loop or the Seq.iter lambda, the two are clearly of similar speed (update2 vs update4).

If there is an inner loop inside the for in loop or the Seq.iter lambda, Seq.iter appears to be twice as fast as for in (update vs update3). Strange? (If run in FSI the two are similar.)

If the build targets AnyCPU and runs on x64, there is no difference in time. So the question becomes: why is Seq.iter (update3) about 2x faster than for in when the target is x64? (A quick way to confirm the process bitness is sketched after the timings below.)
update:  00:00:11.4250483 // 2x as long as update3, why?
update2: 00:00:01.4447233
update3: 00:00:06.0863791
update4: 00:00:01.4939535
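For reference, a minimal check (not part of the original benchmark) confirms which jitter the process is actually running under; Environment.Is64BitProcess and Environment.Is64BitOperatingSystem have been available since .NET 4.0:

// Prints whether the benchmark runs under the x64 or the x86 JIT.
// An AnyCPU build may still run as a 32-bit process on a 64-bit OS.
let is64BitOS      = System.Environment.Is64BitOperatingSystem
let is64BitProcess = System.Environment.Is64BitProcess       // true => x64 jitter
printfn "64-bit OS: %b, 64-bit process: %b" is64BitOS is64BitProcess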
open System.Diagnostics
open System

[<EntryPoint>]
let main argv =
    let pool = seq { 1 .. 1000000 }
    let ret = Array.zeroCreate 100

    // for-in loop with an inner loop
    let update pool =
        for x in pool do
            for y in 1 .. 200 do
                ret.[2] <- x + y

    // for-in loop without an inner loop
    let update2 pool =
        for x in pool do
            //for y in 1 .. 100 do
            ret.[2] <- x

    // Seq.iter with an inner loop
    let update3 pool =
        pool
        |> Seq.iter (fun x ->
            for y in 1 .. 200 do
                ret.[2] <- x + y)

    // Seq.iter without an inner loop
    let update4 pool =
        pool
        |> Seq.iter (fun x ->
            //for y in 1 .. 100 do
            ret.[2] <- x)

    let test n =
        let run =
            match n with
            | 1 -> update
            | 2 -> update2
            | 3 -> update3
            | 4 -> update4
            | _ -> failwith "unknown test number"
        for i in 1 .. 50 do
            run pool

    let sw = new Stopwatch()
    sw.Start()
    test 1
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test 2
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test 3
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test 4
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    0 // return an integer exit code
This isn't a complete answer, but I hope it helps you dig further.
I can reproduce the behaviour using the same configuration. Here is a simpler example for profiling:
open System

// nested loops written directly inside a for-in loop
let test1() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}
    for x in pool do
        for _ in 1 .. 50 do
            for y in 1 .. 200 do
                ret.[2] <- x + y

// the same nested loops, but the outer iteration goes through Seq.iter
let test2() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}
    Seq.iter (fun x ->
        for _ in 1 .. 50 do
            for y in 1 .. 200 do
                ret.[2] <- x + y) pool

let time f =
    let sw = new Diagnostics.Stopwatch()
    sw.Start()
    let result = f()
    sw.Stop()
    Console.WriteLine(sw.Elapsed)
    result

[<EntryPoint>]
let main argv =
    time test1
    time test2
    0
In this example, Seq.iter and for x in pool are each executed once, but there is still a 2x time difference between test1 and test2:
00:00:06.9264843
00:00:03.6834886
Their ILs are very similar, so F# compiler optimization isn't the issue. It seems that the x64 jitter fails to optimize test1, though it manages to do so for test2. Interestingly, if I refactor the nested for loops in test1 into a separate function, JIT optimization succeeds again:
let body (ret: _ []) x =
    for _ in 1 .. 50 do
        for y in 1 .. 200 do
            ret.[2] <- x + y

let test3() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}
    for x in pool do
        body ret x

// 00:00:03.7012302
When I disable JIT optimization using the technique described here, the execution times of these functions become comparable.
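The linked technique isn't reproduced here, but one way to try the same experiment (an assumption on my part, not necessarily what the link describes) is to ask the jitter to skip optimization for a single method via MethodImplAttribute:

open System.Runtime.CompilerServices

// Hypothetical variant of test1 with JIT optimization disabled for this method only.
// With optimization off, test1 and test2 take roughly the same time.
[<MethodImpl(MethodImplOptions.NoInlining ||| MethodImplOptions.NoOptimization)>]
let test1NoOpt() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}
    for x in pool do
        for _ in 1 .. 50 do
            for y in 1 .. 200 do
                ret.[2] <- x + y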
Why the x64 jitter fails in this particular example, I don't know. You can disassemble the optimized jitted code and compare the ASM instructions line by line; maybe someone with good ASM knowledge can spot the difference.
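For example (a sketch, assuming Visual Studio or WinDbg is at hand): let the Release build jit the hot methods once, keep the process alive, attach the debugger with "Suppress JIT optimization on module load" unchecked and "Enable Just My Code" disabled so the optimized code is shown, then open Debug > Windows > Disassembly.

// At the end of main: JIT-compile (and optimize) both methods once, then pause
// so a debugger or WinDbg can be attached to inspect the generated code.
test1 ()
test2 ()
printfn "Jitted. Attach a debugger now, then press Enter to exit."
System.Console.ReadLine() |> ignore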