Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow tail recursion in F#

I have an F# function that returns a list of numbers starting from 0 in the pattern of skip n, choose n, skip n, choose n... up to a limit. For example, this function for input 2 will return [2, 3, 6, 7, 10, 11...].

Initially I implemented this as a non-tail-recursive function as below:

let rec indicesForStep start blockSize maxSize =
    match start with
    | i when i > maxSize -> []
    | _ -> [for j in start .. ((min (start + blockSize) maxSize) - 1) -> j] @ indicesForStep (start + 2 * blockSize) blockSize maxSize

Thinking that tail recursion is desirable, I reimplemented it using an accumulator list as follows:

let indicesForStepTail start blockSize maxSize =
    let rec indicesForStepInternal istart accumList =
        match istart with
        | i when i > maxSize -> accumList
        | _ -> indicesForStepInternal (istart + 2 * blockSize) (accumList @ [for j in istart .. ((min (istart + blockSize) maxSize) - 1) -> j])
    indicesForStepInternal start []

However, when I run this in fsi under Mono with the parameters 1, 1 and 20,000 (i.e. should return [1, 3, 5, 7...] up to 20,000), the tail-recursive version is significantly slower than the first version (12 seconds compared to sub-second).

Why is the tail-recursive version slower? Is it because of the list concatenation? Is it a compiler optimisation? Have I actually implemented it tail-recursively?

I also feel as if I should be using higher-order functions to do this, but I'm not sure exactly how to go about doing it.

like image 276
Gnat Avatar asked Nov 29 '22 17:11

Gnat


2 Answers

As dave points out, the problem is that you're using the @ operator to append lists. This is more significant performance issue than tail-recursion. In fact, tail-recursion doesn't really speed-up the program too much (but it makes it work on large inputs where the stack would overflow).

The reason why you'r second version is slower is that you're appending shorter list (the one generated using [...]) to a longer list (accumList). This is slower than appending longer list to a shorter list (because the operation needs to copy the first list).

You can fix it by collecting the elements in the accumulator in a reversed order and then reversing it before returning the result:

let indicesForStepTail start blockSize maxSize =
    let rec indicesForStepInternal istart accumList =
        match istart with
        | i when i > maxSize -> accumList |> List.rev
        | _ -> 
           let acc = 
             [for j in ((min (istart + blockSize) maxSize) - 1) .. -1 .. istart -> j] 
             @ accumList
           indicesForStepInternal (istart + 2 * blockSize) acc
    indicesForStepInternal start []

As you can see, this has the shorter list (generated using [...]) as the first argument to @ and on my machine, it has similar performance to the non-tail-recursive version. Note that the [ ... ] comprehension generates elements in the reversed order - so that they can be reversed back at the end.

You can also write the whole thing more nicely using the F# seq { .. } syntax. You can avoid using the @ operator completely, because it allows you to yield individual elemetns using yield and perform tail-recursive calls using yield!:

let rec indicesForStepSeq start blockSize maxSize = seq {
    match start with
    | i when i > maxSize -> ()
    | _ -> 
      for j in start .. ((min (start + blockSize) maxSize) - 1) do
        yield j
      yield! indicesForStepSeq (start + 2 * blockSize) blockSize maxSize }

This is how I'd write it. When calling it, you just need to add Seq.toList to evaluate the whole lazy sequence. The performance of this version is similar to the first one.

EDIT With the correction from Daniel, the Seq version is actually slightly faster!

like image 166
Tomas Petricek Avatar answered Dec 04 '22 13:12

Tomas Petricek


In F# the list type is implemented as a singly linked list. Because of this you get different performance for x @ y and y @ x if x and y are of different length. That's why your seeing a difference in performance. (x @ y) has running time of X.length.

// e.g.
let x = [1;2;3;4]
let y = [5]

If you did x @ y then x (4 elements) would be copied into a new list and its internal next pointer would be set to the existing y list. If you did y @ x then y (1 element) would be copied into a new list and its next pointer would be set to the existing list x.

I wouldn't use a higher order function to do this. I'd use list comprehension instead.

let indicesForStepTail start blockSize maxSize =
    [ 
        for block in start .. (blockSize * 2) .. (maxSize - 1) do
            for i in block .. (block + blockSize - 1) do
                yield i
    ]
like image 34
gradbot Avatar answered Dec 04 '22 13:12

gradbot