I have been digging into the F# source code recently. In Seq.fs there is this comment:
// Binding.
//
// We use a type definition to apply a local dynamic optimization.
// We automatically right-associate binding, i.e. push the continuations to the right.
// That is, bindG (bindG G1 cont1) cont2 --> bindG G1 (cont1 o cont2)
// This makes constructs such as the following linear rather than quadratic:
//
// let rec rwalk n = { if n > 0 then
// yield! rwalk (n-1)
// yield n }
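The re-association rule in that comment can be modelled in Python (a toy sketch of my own, not the actual Seq.fs implementation): an `Append` node rewrites `(a ++ b) ++ c` into `a ++ (b ++ c)` at construction time, so iteration only ever walks a right-leaning spine instead of re-driving nested iterators.

```python
class Append:
    """Toy model of bindG: a lazy append of two sources."""

    def __init__(self, left, right):
        if isinstance(left, Append):
            # (a ++ b) ++ c  -->  a ++ (b ++ c): push continuations right.
            # left was built by this constructor, so left.left is plain.
            left, right = left.left, Append(left.right, right)
        self.left, self.right = left, right

    def __iter__(self):
        node = self
        while isinstance(node, Append):
            yield from node.left   # invariant: left is never an Append node
            node = node.right
        yield from node            # the final plain tail

# A left-nested construction still iterates in linear time:
s = Append(Append(Append([1], [2]), [3]), [4])
assert list(s) == [1, 2, 3, 4]
```

Because every `Append` built this way has a plain (non-`Append`) left child, each element is handed up through at most one level of delegation, which is the "linear rather than quadratic" claim in the comment.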
After seeing this comment, I tested two pieces of code:
let rec rwalk n = seq { if n > 0 then
yield n
yield! rwalk (n-1)
}
and
let rec rwalk n = seq { if n > 0 then
yield! rwalk (n-1)
yield n
}
I found that the first one is very fast, while the second is very slow: for n = 10000 it takes about 3 seconds on my machine to generate the sequence, which suggests quadratic time.
The quadratic time seems reasonable, as e.g.
seq { yield! {1; 2; ...; n-1}; yield n }
translates to
Seq.append {1; 2; ...; n-1} {n}
and this append operation should take linear time, I guess. In the first version, on the other hand, the operation looks like seq { yield n; yield! {n-1; n-2; ...; 1} }, which is more like a cons and costs constant time.
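The cost difference can be mimicked with plain Python lists (a rough cost model of my own, not the actual seq machinery): cons-style construction does O(1) work per step, while rebuilding via append copies the accumulated prefix on every step.

```python
def build_cons(n):
    # like seq { yield n; yield! rwalk (n-1) }: O(1) work per element
    out = []
    for k in range(n, 0, -1):
        out.append(k)        # amortized O(1)
    return out               # [n; n-1; ...; 1]

def build_append(n):
    # like seq { yield! rwalk (n-1); yield n }: copies the prefix each step
    out = []
    for k in range(1, n + 1):
        out = out + [k]      # O(k) copy per step -> O(n^2) total
    return out               # [1; 2; ...; n]
```

Both produce the same multiset of elements; only the per-step cost differs, mirroring the fast/slow behaviour of the two seq versions.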
However, the comment in the code says that it is linear (maybe this "linear" does not mean linear time?). Or maybe the linearity relates to the customized implementation of sequence expressions, rather than the generic monad/computation-expression translation (the F# specification mentions this customized implementation, but does not explain the reason for it).
Could anyone clarify the fuzziness here? Thanks a lot!
(P.S. Because this is a language-design and optimization problem, I also attached the Haskell tag to see if people there have insights.)
When yield! appears in a non-tail-call position, it essentially means the same thing as:
for v in <expr> do yield v
The problem with this (and the reason why it is quadratic) is that, for recursive calls, this creates a chain of iterators with nested for loops. You need to iterate over the whole sequence generated by <expr> for every single element, so if the iteration is linear, you get quadratic time (because the linear iteration happens for every element).
Let's say the rwalk function generates [ 9; 2; 3; 7 ]. At the top level, the recursively generated sequence has 3 elements, so you'd iterate over 3 elements and add 1. In the recursive call, you'd iterate over 2 elements and add 1, etc. Using a diagram, you can see how that's quadratic:
x
x x
x x x
x x x x
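This nested delegation can be reproduced in Python, whose generators behave the same way for a non-tail recursive loop (the `moves` counter is my own instrumentation; it counts how many times an element is handed up one level):

```python
moves = 0  # counts every hand-off of an element to the next level up

def rwalk(n):
    """Analogue of seq { yield! rwalk (n-1); yield n }, instrumented."""
    global moves
    if n > 0:
        for v in rwalk(n - 1):  # re-drives the whole inner sequence
            moves += 1
            yield v
        moves += 1
        yield n

assert list(rwalk(4)) == [1, 2, 3, 4]
assert moves == 10  # = 4 + 3 + 2 + 1, the triangle in the diagram above
```

For n elements the total number of hand-offs is n(n+1)/2, which is exactly the triangular diagram.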
Also, each of the recursive calls creates a new object (an IEnumerator), so there is also some memory cost (although only linear).
In a tail-call position, the F# compiler/library does an optimization: it "replaces" the current IEnumerable with the one returned by the recursive call, so it doesn't need to iterate over it to generate all elements - the new enumerable is simply returned (and this also removes the memory cost).
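The effect of that optimization can be sketched in Python (a hypothetical encoding of my own, not F#'s actual code): instead of nesting one iterator per recursive call, each step returns the values it yields before the tail call together with the argument of the trailing recursive call, and a driver loop simply swaps the new "enumerable" in.

```python
def rwalk_step(n):
    # One "frame": values yielded before the tail call, plus the argument
    # of the trailing recursive call (None when the recursion ends).
    return ([n], n - 1) if n > 0 else ([], None)

def drive(n):
    # Driver loop: replaces the current sequence with the returned one,
    # so every element costs O(1) and no iterators are ever nested.
    out = []
    while n is not None:
        vals, n = rwalk_step(n)
        out.extend(vals)
    return out

assert drive(4) == [4, 3, 2, 1]
```

Each element is produced exactly once with no delegation chain, which is why the tail-call version runs in linear time.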
Related: the same problem has been discussed in the C# language design, and there is an interesting paper about it (their name for yield! is yield foreach).