Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to merge sorted sequences?

Tags:

f#

Here’s a problem I’ve really been struggling with. I need to merge two sorted sequences into a single sorted sequence. Ideally, the algorithm should be lazy-evaluated, and not require caching more than one item from each sequence. This is not a terribly difficult problem to solve, and I’ve been able to engineer a number of solutions in F#. Unfortunately, every solution I’ve come up with has one of several problems.

  1. Recursive calls to subsequence generators using yield!. This produces elegant looking solutions, but the creation of a subsequence for every item is a performance killer.

  2. Really arcane and unmaintainable code with deeply-stacked match switches, multiple nearly identical blocks of code, etc.

  3. Code which forces F# into a purely procedural mode (lots of mutable values, etc.).

And all of the online examples I've been able to find founder on the same shoals.

Am I missing something obvious: like it's either really simple or else obviously impossible? Does anyone know of a really elegant solution that is also efficient and mostly functional? (It doesn’t have to be purely functional.) If not, I may end up caching subsequences and using lists or arrays.

like image 615
TechNeilogy Avatar asked Aug 14 '10 17:08

TechNeilogy


People also ask

How do I merge two sorted Arraylists?

Step 1 : Let arrayA and arrayB be two sorted integer input arrays. Step 2 : Declare mergedArray with combined size of arrayA and arrayB . Step 4 : Smaller element in arrayA[i] and arrayB[j] is assigned to mergedArray[k] .

Can you merge sort a list?

function will sort a linked list using the merge sort algorithm. to cut the list from the middle node into two halves. Eventually, we sort each part separately, then we'll merge them to get a single sorted list.


1 Answers

Ideally, the algorithm should be lazy-evaluate... the creation of a subsequence for every item is a performance killer

Lazy means slow but here is a solution using lazy lists:

let (++) = LazyList.consDelayed

let rec merge xs ys () =
  match xs, ys with
  | Cons(x, xs'), Cons(y, _) when x<y -> x ++ merge xs' ys
  | Cons(x, _), Cons(y, ys') -> y ++ merge xs ys'
  | Nil, xs | xs, Nil -> xs

I think by "lazy evaluated" you mean you want the merged result to be generated on demand so you can also use:

let rec merge xs ys = seq {
  match xs, ys with
  | x::xs, y::_ when x<y ->
      yield x
      yield! merge xs ys
  | x::_, y::ys ->
      yield y
      yield! merge xs ys
  | [], xs | xs, [] -> yield! xs
}

As you say, this is very inefficient. However, a seq-based solution doesn't have to be slow. Here, Seq.unfold is your friend and can make this over 4× faster by my measurements:

let merge xs ys =
  let rec gen = function
    | x::xs, (y::_ as ys) when x<y -> Some(x, (xs, ys))
    | xs, y::ys -> Some(y, (xs, ys))
    | [], x::xs | x::xs, [] -> Some(x, ([], xs))
    | [], [] | [], [] -> None
  Seq.unfold gen (xs, ys)
like image 90
J D Avatar answered Sep 25 '22 10:09

J D