
Most idiomatic way to write batchesOf size seq in F#

I'm trying to learn F# by rewriting some C# algorithms I have into idiomatic F#.

One of the first functions I'm trying to rewrite is batchesOf, where:

[1..17] |> batchesOf 5

This would split the sequence into batches of at most five elements each, i.e.:

[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]

My first attempt is kind of ugly: I resorted to a mutable ref cell after running into errors when trying to capture a mutable variable inside the closure. Using ref is particularly unpleasant because dereferencing requires the ! operator, which inside a condition can be counterintuitive to devs who will read it as logical not. Another problem I ran into is that Seq.skip and Seq.take, unlike their LINQ counterparts (Skip and Take), throw an error if size exceeds the number of elements left in the sequence.

let batchesOf size (sequence: _ seq) : _ list seq =
    seq {
        let s = ref sequence
        while not (!s |> Seq.isEmpty)  do
            yield !s |> Seq.truncate size |> List.ofSeq
            s := System.Linq.Enumerable.Skip(!s, size)
    }
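The difference between Seq.take, Seq.truncate and the LINQ method that motivated the code above can be seen in a short sketch. The names short, truncated, takeResult and linqTake are only illustrative, and the exception is caught generically because its exact type is not something the question states:

```fsharp
let short = [1; 2; 3]

// Seq.truncate returns at most 5 elements and accepts a shorter input.
let truncated = short |> Seq.truncate 5 |> List.ofSeq

// Seq.take requires 5 elements to be present; because sequences are lazy,
// the failure only surfaces when the result is enumerated.
let takeResult =
    try
        short |> Seq.take 5 |> List.ofSeq |> Some
    with _ -> None

// The LINQ method simply stops at the end of the input.
let linqTake = System.Linq.Enumerable.Take(short, 5) |> List.ofSeq
```

Here truncated and linqTake both hold [1; 2; 3], while takeResult is None because Seq.take threw during enumeration.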

Anyway, what would be the most elegant/idiomatic way to rewrite this in F#? I'd like to keep the original behaviour, preferably without the mutable ref variable.

mythz asked Sep 22 '11 05:09



2 Answers

Implementing this function idiomatically using the seq<_> type is difficult - a seq is consumed through a stateful, mutable enumerator, so there is no simple, nice functional way. Your version is quite inefficient, because it applies Skip repeatedly: each batch wraps the sequence in another Skip, so every batch re-enumerates the source from the start. A better imperative option would be to call GetEnumerator and iterate over the elements using the IEnumerator directly. You can find various imperative options in this snippet: http://fssnip.net/1o
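As a rough illustration of the GetEnumerator approach, here is a minimal sketch. It is not the code from the linked snippet; the ResizeArray buffer and the argument check are my own assumptions about how one might write it:

```fsharp
let batchesOf size (source: seq<'a>) : seq<'a list> =
    // Assumption: a non-positive size is treated as a caller error.
    if size <= 0 then invalidArg "size" "size must be positive"
    seq {
        // One enumerator for the whole run, so the source is walked only once.
        use e = source.GetEnumerator()
        let batch = ResizeArray<'a>()
        while e.MoveNext() do
            batch.Add(e.Current)
            if batch.Count = size then
                // List.ofSeq copies the buffer, so clearing it afterwards is safe.
                yield List.ofSeq batch
                batch.Clear()
        // Emit the final, possibly shorter, batch.
        if batch.Count > 0 then
            yield List.ofSeq batch
    }
```

The mutation is confined to the inside of the sequence expression, and each enumeration of the result gets its own enumerator and buffer.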

If you're learning F#, it is better to try writing the function using the F# list type. This way, you can use an idiomatic functional style. You can then write batchesOf using pattern matching with recursion and an accumulator argument, like this:

let batchesOf size input = 
  // Inner function that does the actual work.
  // 'input' is the remaining part of the list, 'num' is the number of elements
  // in a current batch, which is stored in 'batch'. Finally, 'acc' is a list of
  // batches (in a reverse order)
  let rec loop input num batch acc =
    match input with
    | [] ->
        // We've reached the end - add the current batch to the list of all
        // batches if it is not empty and return the batches (in the right order)
        let batches = if List.isEmpty batch then acc else (List.rev batch)::acc
        List.rev batches
    | x::xs when num = size - 1 ->
        // We've reached the end of the batch - add the last element
        // and add batch to the list of batches.
        loop xs 0 [] ((List.rev (x::batch))::acc)
    | x::xs ->
        // Take one element from the input and add it to the current batch
        loop xs (num + 1) (x::batch) acc
  loop input 0 [] []

As a footnote, the imperative version can be made a bit nicer using a computation expression for working with IEnumerator, but that's not standard and it is quite an advanced trick (for example, see http://fssnip.net/37).

Tomas Petricek answered Nov 15 '22 23:11


A friend asked me this a while back. Here's a recycled answer. This works and is pure:

let batchesOf n =
    Seq.mapi (fun i v -> i / n, v) >>
    Seq.groupBy fst >>
    Seq.map snd >>
    Seq.map (Seq.map snd)
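To see what each stage of that pipeline produces, here is a small worked sketch on a seven-element input with n = 3. The names n, input, indexed and batches are only for illustration, and the results are converted to lists so they are easy to inspect:

```fsharp
let n = 3
let input = [1 .. 7]

// Step 1: pair each element with its batch index (element index / n).
let indexed = input |> Seq.mapi (fun i v -> i / n, v) |> List.ofSeq
// [(0, 1); (0, 2); (0, 3); (1, 4); (1, 5); (1, 6); (2, 7)]

// Step 2: group on the batch index, then drop the keys and the indices.
let batches =
    indexed
    |> Seq.groupBy fst
    |> Seq.map (snd >> Seq.map snd >> List.ofSeq)
    |> List.ofSeq
// [[1; 2; 3]; [4; 5; 6]; [7]]
```

One thing to keep in mind is that Seq.groupBy has to read the whole input before it can return the first group, so this approach does not suit infinite sequences.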

Or an impure version:

let batchesOf n =
    let i = ref -1
    Seq.groupBy (fun _ -> i := !i + 1; !i / n) >> Seq.map snd

These produce a seq<seq<'a>>. If you really must have an 'a list list as in your sample, then just append ... |> Seq.map (List.ofSeq) |> List.ofSeq, as in:

> [1..17] |> batchesOf 5 |> Seq.map (List.ofSeq) |> List.ofSeq;;
val it : int list list = [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
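If you are on F# 4.0 or later, it may be worth checking whether the built-in chunkBySize functions in FSharp.Core already do what you need. A minimal sketch, where asLists and asArrays are just illustrative names:

```fsharp
// List.chunkBySize returns a list of lists directly.
let asLists = [1 .. 17] |> List.chunkBySize 5
// [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]

// Seq.chunkBySize yields each batch as an array.
let asArrays = seq { 1 .. 17 } |> Seq.chunkBySize 5 |> List.ofSeq
// [[|1; 2; 3; 4; 5|]; [|6; 7; 8; 9; 10|]; [|11; 12; 13; 14; 15|]; [|16; 17|]]
```

Note that the Seq version gives a seq of arrays rather than a seq of lists, so a Seq.map List.ofArray is still needed if lists are required.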

Hope that helps!

AshleyF answered Nov 15 '22 22:11