Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tail-recursive merge sort in OCaml

I’m trying to implement a tail-recursive list-sorting function in OCaml, and I’ve come up with the following code:

let tailrec_merge_sort l =
  let split l = 
    let rec _split source left right =
      match source with
        | [] -> (left, right)
        | head :: tail -> _split tail right (head :: left) 
    in _split l [] []
  in

  let merge l1 l2 = 
    let rec _merge l1 l2 result =
      match l1, l2 with
        | [], [] -> result
        | [], h :: t | h :: t, [] -> _merge [] t (h :: result)
        | h1 :: t1, h2 :: t2 ->
            if h1 < h2 then _merge t1 l2 (h1 :: result)
            else            _merge l1 t2 (h2 :: result)
    in List.rev (_merge l1 l2 [])
  in

  let rec sort = function
    | [] -> []
    | [a] -> [a]
    | list -> let left, right = split list in merge (sort left) (sort right)
  in sort l
;;

Yet it seems that it is not actually tail-recursive, since I encounter a "Stack overflow during evaluation (looping recursion?)" error.

Could you please help me spot the non tail-recursive call in this code? I've searched quite a lot, without finding it. Cout it be the let binding in the sort function?

like image 336
Clément Avatar asked Mar 27 '10 14:03

Clément


People also ask

Is merge sort tail recursive?

Your merge-sort is not tail recursive because the last function called in mergesort/3 is merge/3. You call mergesort as arguments of merge so stack has to grow - upper called mergesort/3 is not yet finished and its stack frame can't be reused.

What is a tail recursive method?

(algorithmic technique) Definition: A special form of recursion where the last operation of a function is a recursive call. The recursion may be optimized away by executing the call in the current stack frame and returning its result rather than creating a new stack frame.

How recursive function works in merge sort?

Merge sort is a recursive algorithm that continually splits a list in half. If the list is empty or has one item, it is sorted by definition (the base case). If the list has more than one item, we split the list and recursively invoke a merge sort on both halves.


2 Answers

Merge sort is inherently not tail-recursive. A sort requires two recursive calls, and in any execution of any function, at most one dynamic call can be in tail position. (split is also called from non-tail position, but since it should use constant stack space that should be OK).

Be sure you are using a recent version of OCaml. In versions 3.08 and older, List.rev could blow the stack. This problem is fixed in version 3.10. Using version 3.10.2, I can sort a list of ten million elements using your code. It takes a couple of minutes, but I don't blow the stack. So I'm hoping your problem is simply that you have an old version of OCaml.

If not, a good next step would be to set the environment variable

OCAMLRUNPARAM=b=1

which will give a stack trace when you blow the stack.

I'd also like to know the length of the arrays you are sorting; although merge sort cannot be tail-recursive, its non-tail nature should cost you logarithmic stack space.

If you need to sort more than 10 million elements, maybe you should be looking at a different algorithm? Or if you want, you could CPS-convert mergesort by hand, which will yield a tail-recursive version at the cost of allocating contiuations on the heap. But my guess is that it won't be necessary.

like image 145
Norman Ramsey Avatar answered Oct 27 '22 00:10

Norman Ramsey


The expression

merge (sort left) (sort right)

is not tail-recursive; it calls both (sort left) and (sort right) recursively while there is remaining work in the call (merge).

I think you can fix it with an extra continuation parameter:

  let rec sort l k =
    match l with
    | [] -> k [] 
    | [a] -> k [a] 
    | list -> let left, right = split list in sort left (fun leftR -> sort right (fun rightR -> k (merge leftR rightR)))
  in sort l (fun x -> x)
like image 31
Brian Avatar answered Oct 27 '22 00:10

Brian