Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does (Array/List/Seq).groupBy maintain sort order within groups?

Tags:

f#

Does groupBy guarantee that sort order is preserved in code like the following?

x
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)

By preserving sort order, I mean can we guarantee that within each grouping by x, the result is still sorted by y.

This is true for simple examples,

[(1, 3);(2, 1);(1, 1);(2, 3)]
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)
// seq [(2, seq [(2, 1); (2, 3)]); (1, seq [(1, 1); (1, 3)])]

I want to make sure there are no weird edge cases.

like image 840
nh2 Avatar asked Aug 31 '25 22:08

nh2


2 Answers

What do you mean by preserving sort order? Seq.groupBy changes the type of the sequence, so how can you even meaningfully compare before and after?

For a given xs of the type seq<'a * 'b>, the type of the expression xs |> Seq.sortBy snd is seq<'a * 'b>, whereas the type of the expression xs |> Seq.sortBy snd |> Seq.groupBy fst is seq<'a * seq<'a * 'b>>. Thus, whether or not the answer to the question is yes or no depends on what you mean by preserving the sort order.

As @Petr wrote in the comments, it's easy to test this. If you're worried about special cases, write a Property using FsCheck and see if it generalises:

open FsCheck.Xunit
open Swensen.Unquote

[<Property>]
let isSortOrderPreserved (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst

    let expected = xs |> Seq.sortBy snd |> Seq.toList
    expected =! (actual |> Seq.map snd |> Seq.concat |> Seq.toList)

In this property, I've interpreted the property of sort order preservation to mean that if you subsequently concatenate the grouped sequences, the sort order is preserved. Your definition may be different.

Given this particular definition, however, running the property clearly demonstrates that the property doesn't hold:

Falsifiable, after 6 tests (13 shrinks) (StdGen (1448745695,296088811)):
Original:
[(-3, -7); (4, -7); (4, 0); (-4, 0); (-4, 7); (3, 7); (3, -1); (-5, -1)]
Shrunk:
[(3, 1); (3, 0); (0, 0)]

---- Swensen.Unquote.AssertionFailedException : Test failed:

[(3, 0); (0, 0); (3, 1)] = [(3, 0); (3, 1); (0, 0)]
false

Here we see that if the input is [(3, 1); (3, 0); (0, 0)], the grouped sequence doesn't preserve the sort order (which isn't surprising to me).


Based on the updated question, here's a property that examines that question:

[<Property(MaxTest = 10000)>]
let isSortOrderPreservedWithEachGroup (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst

    let expected =
        actual
        |> Seq.map (fun (k, vals) -> k, vals |> Seq.sort |> Seq.toList)
        |> Seq.toList
    expected =!
        (actual |> Seq.map (fun (k, vals) -> k, Seq.toList vals) |> Seq.toList)

This property does, indeed, hold:

Ok, passed 10000 tests.

You should still consider carefully whether you want to rely on behaviour that isn't documented, since it could change in later incarnations of F#. Personally, I'd adopt a piece of advice from the Zen of Python:

Explicit is better than implicit.

BTW, the reason for all that conversion to F# lists is because lists have structural equality, while sequences don't.

like image 99
Mark Seemann Avatar answered Sep 04 '25 23:09

Mark Seemann


Who cares. Instead of sorting and then grouping, just group and then sort and the ordering is guaranteed even if the F# implementation of groupBy eventually changes:

x |> Seq.groupBy (fun (x, y) -> x) |> Seq.map (fun (k, v) -> k, v |> Seq.sortBy (fun (x, y) -> y))

like image 36
nh2 Avatar answered Sep 05 '25 00:09

nh2