Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazily grouping a flat sequence in F#

Given a sequence of items as follows:

[ ("a", 1); ("a", 2); ("a", 3); ("b", 1); ("c", 2); ("c", 3) ]

How can I convert this lazily into:

{ ("a", { 1; 2; 3}); ("b", { 1 }); ("c", { 2; 3}) }

You can assume that the input data source is already sorted on the grouping key element e.g. "a" "b" and "c".

I'm using the { } there to indicate that it's a lazily-evaluated sequence of items.

I've gotten it working imperatively with two while loops operating over the IEnumerator of the source sequence, but this involves creating reference variables and mutation etc. etc. I'm sure that there are better ways of doing this, perhaps with Recursion or using some of the operations in the Seq library e.g. scan or unfold?

like image 483
Isaac Abraham Avatar asked Dec 16 '22 01:12

Isaac Abraham


1 Answers

If you want to implement this over IEnumerable<'T> (to make it lazy), then it is necessarily going to be somewhat imperative, because the IEnumerator<'T> type that is used to iterate over the input is imperative. But the rest can be written as a recursive function using sequence expressions.

The following is lazy in the first level (it produces each group lazily), but it does not produce elements of the group lazily (I think that would have pretty subtle semantics):

/// Group adjacent elements of 'input' according to the 
/// keys produced by the key selector function 'f'
let groupAdjacent f (input:seq<_>) = seq {
  use en = input.GetEnumerator()

  // Iterate over elements and keep the key of the current group
  // together with all the elements belonging to the group so far
  let rec loop key acc = seq { 
    if en.MoveNext() then 
      let nkey = f en.Current 
      if nkey = key then 
        // If the key matches, append to the group so far
        yield! loop key (en.Current::acc)
      else 
        // Otherwise, produce the group collected so far & start a new one
        yield List.rev acc
        yield! loop nkey [en.Current]
    else
      // At the end of the sequence, produce the last group
      yield List.rev acc
  }
  // Start with the first key & first value as the accumulator
  if en.MoveNext() then 
    yield! loop (f en.Current) [en.Current] }

Unfortunately, this (pretty useful!) function is not included in the standard F# library, so if you want to group adjacent elements (rather than arbitrary elements in the list using Seq.groupBy), you have to define it yourself...

like image 164
Tomas Petricek Avatar answered Feb 08 '23 10:02

Tomas Petricek