The code:
#time "on"
let newVector = [|
for v in 1..10000000 ->
v |]
let newVector2 =
let a = Array.zeroCreate 10000000
for v in 1..10000000 do
a.[v-1] <- v
a
let newVector3 =
let a = System.Collections.Generic.List() // do not set capacity
for v in 1..10000000 do
a.Add(v)
a.ToArray()
gives timing in FSI:
--> Timing now on
>
Real: 00:00:01.121, CPU: 00:00:01.156, GC gen0: 4, gen1: 4, gen2: 4
val newVector : int [] = [|1; 2; 3; 4; ...|]
>
Real: 00:00:00.024, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val newVector2 : int32 [] = [|1; 2; 3; 4; ...|]
>
Real: 00:00:00.173, CPU: 00:00:00.156, GC gen0: 2, gen1: 2, gen2: 2
val newVector3 : int32 [] = [|1; 2; 3; 4; ...|]
Not a big difference in a standalone app, release mode, without debug, 5 runs average:
The third method doesn't know the original capacity but is still almost 7x faster. Why sequence expressions for arrays are so slow in F#?
Update:
let seq = seq { for v in 1..10000000 do yield v }
let seqArr = seq |> Seq.toArray
Real: 00:00:01.060, CPU: 00:00:01.078, GC gen0: 2, gen1: 2, gen2: 2
let newVector4 =
let a = System.Collections.Generic.List() // do not set capacity
for v in seq do
a.Add(v)
a.ToArray()
Real: 00:00:01.119, CPU: 00:00:01.109, GC gen0: 1, gen1: 1, gen2: 1
open System.Linq
let newVector5 = seq.ToArray()
Real: 00:00:00.969, CPU: 00:00:00.968, GC gen0: 0, gen1: 0, gen2: 0
this gives the same timing as the first one and doesn't depend on GC. So the real question is, why enumerating 1..10000000
is so much slower that for
loop in the second and the third case?
Update 2
open System
open System.Linq
open System.Collections.Generic
let newVector5 = seq.ToArray()
let ie count =
{ new IEnumerable<int> with
member x.GetEnumerator(): Collections.IEnumerator = x.GetEnumerator() :> Collections.IEnumerator
member x.GetEnumerator(): IEnumerator<int> =
let c = ref 0
{ new IEnumerator<int> with
member y.MoveNext() =
if !c < count then
c := !c + 1
true
else false
member y.Current with get() = !c + 1
member y.Current with get() = !c + 1 :> obj
member y.Dispose() = ()
member y.Reset() = ()
}
}
let newVector6 =
let a = System.Collections.Generic.List() // do not set capacity
for v in ie 10000000 do
a.Add(v)
a.ToArray()
Real: 00:00:00.185, CPU: 00:00:00.187, GC gen0: 1, gen1: 1, gen2: 1
Manual implementation of IEnumerable is equivalent to for
loop. I wonder why expansion of lo..hi
should be so much slower for a generic case. It could be implemented by method overloads at least for the most common types.
In cases like this I always check the generated code using one of the many good decompilers in .NET.
let explicitArray () =
let a = Array.zeroCreate count
for v in 1..count do
a.[v-1] <- v
a
This is compiled into the equivalent C#:
public static int[] explicitArray()
{
int[] a = ArrayModule.ZeroCreate<int>(10000000);
for (int v = 1; v < 10000001; v++)
{
a[v - 1] = v;
}
return a;
}
This is about as efficient it gets.
let arrayExpression () =
[| for v in 1..count -> v |]
This on the other hand becomes:
public static int[] arrayExpression()
{
return SeqModule.ToArray<int>(new Program.arrayExpression@7(0, null, 0, 0));
}
This equivalent to:
let arrayExpression () =
let e = seq { for v in 1..count -> v }
let a = List() // do not set capacity
for v in e do
a.Add(v)
a.ToArray()
When iterating over a seq
(alias for IEnumerable
) first MoveNext
is called, then Current
. These are virtual calls which the JIT:er seldomly can inline. Checking the JIT:ed assembly code we see this:
mov rax,qword ptr [rbp+10h]
cmp byte ptr [rax],0
mov rcx,qword ptr [rbp+10h]
lea r11,[7FFC07830030h]
# virtual call .MoveNext
call qword ptr [7FFC07830030h]
movzx ecx,al
# if .MoveNext returns false then exit
test ecx,ecx
je 00007FFC079408A0
mov rcx,qword ptr [rbp+10h]
lea r11,[7FFC07830038h]
# virtual call .Current
call qword ptr [7FFC07830038h]
mov edx,eax
mov rcx,rdi
# call .Add
call 00007FFC65C8B300
# loop
jmp 00007FFC07940863
If we compare this with the JIT:ed code for the code that used ResizeArray
(List
)
lea edx,[rdi-1]
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
mov edx,edi
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
lea edx,[rdi+1]
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
lea edx,[rdi+2]
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
add edi,4
# loop
cmp edi,989682h
jl 00007FFC07910384
Here the JIT:er has unrolled the loop 4 times here and we have only non-virtual calls to List.Add
.
This explains why F# array expressions are slower than your other two examples.
In order to fix this one would have to fix the optimizer in F# to recognize the shape of expressions such as this:
seq { for v in 1..count -> v } |> Seq.toArray
And optimize them into:
let a = Array.zeroCreate count
for v in 1..count do
a.[v-1] <- v
a
The challenge is to find an optimization that is general enough to be useful but also doesn't break the semantics of F#.
For grins, I dumped your array comprehension into ILDASM to get a look at what happens. I put the let
into main
and got this:
.locals init ([0] int32[] newVector,
[1] class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<string[],class [FSharp.Core]Microsoft.FSharp.Core.Unit> V_1,
[2] string[] V_2)
IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: ldnull
IL_0003: ldc.i4.0
IL_0004: ldc.i4.0
IL_0005: newobj instance void Program/newVector@11::.ctor(int32,
class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>,
int32,
int32)
IL_000a: call !!0[] [FSharp.Core]Microsoft.FSharp.Collections.SeqModule::ToArray<int32>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>)
IL_000f: stloc.0
So an instance of newVector@11 gets created and that class inherits from GeneratedSequenceBase, which itself, implements IENumerable. This makes sense because next there is a call to Seq.ToArray. Looking at that class, there is no way to determine whether or not the length of the sequence is known, even though it is knowable in this case, because of the nature of IEnumerable. This tells me that it should be equivalent to:
let seqArr = seq { for v in 1..10000000 do yield v } |> Seq.toArray
and the timing bears this out:
Real: 00:00:01.032, CPU: 00:00:01.060, GC gen0: 4, gen1: 4, gen2: 4
for a final bit of fun, I put the above sequence comprehension through ILDASM and the wrapper class and GenerateNext method within it are instruction for instruction identical to the array comprehension. So I think that it is very safe to conclude that any array comprehension of the form:
let arr = [| sequence-expr |]
is 100% equivalent to:
let arr = seq { sequence-expr } |> Seq.toArray
And this is hinted at in the F# array documentation with the following line:
You can also use sequence expressions to create arrays. Following is an example that creates an array of squares of integers from 1 to 10.
which is really saying "it's a sequence, not its own thing."
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With