groupBy-like function such that the binary predicate holds between consecutive elements of each group instead of any two

Question

On Hackage I see that groupBy's implementation is this:

groupBy                 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []           =  []
groupBy eq (x:xs)       =  (x:ys) : groupBy eq zs
                           where (ys,zs) = span (eq x) xs

which means that the predicate eq holds between any two elements of each group. Examples:

> difference_eq_1 = ((==1).) . flip (-)
> first_isnt_newline = ((/= '
').) . const
>
> Data.List.groupBy difference_eq_1 ([1..10] ++ [11,13..21])
[[1,2],[3,4],[5,6],[7,8],[9,10],[11],[13],[15],[17],[19],[21]]
>
> Data.List.groupBy first_isnt_newline "uno
due
tre"
["uno
due
tre"]

What if instead I want to group elements such that the predicate holds between any pair of consecutive elements, so that the above results would be as follows?

[[1,2,3,4,5,6,7,8,9,10,11],[13],[15],[17],[19],[21]]
["uno
","due
","tre"]

I wrote it myself, and it looks a bit ugly

groupBy' :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy' p = foldr step []
  where step elem [] = [[elem]]
        step elem gs'@((g'@(prev:g)):gs)
          | elem `p` prev = (elem:g'):gs
          | otherwise = [elem]:gs'

So I was wandering if such a function exists already and I just don't find it.

As regards the second usage, Data.List.groupBy first_isnt_newline, where the binary predicate basically ignores the second argument and applies a unary predicate to the first, I've just found that Data.List.HT.segmentAfter unary_predicate does the job, where unary_predicate is the negation of the unary predicate in which const's output is forwarded. In other words Data.List.groupBy ((/= ' ').) . const === Data.List.HT.segmentAfter (==' ').

Jon Purdy · Accepted Answer

There is a groupBy package that does exactly that.

But here’s another way of implementing it:

Zip the list with its tail to test the predicate on adjacent elements
Generate a “group index” by scanning the result and incrementing the group whenever the predicate is false
Group by the index
Remove the indices

groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent p xs
  = fmap (fmap fst)
  $ groupBy ((==) `on` snd)
  $ zip xs
  $ scanl' (\ g (a, b) -> if p a b then g else succ g) 0
  $ zip xs
  $ drop 1 xs

For an input like [1, 2, 3, 10, 11, 20, 30], the predicate will return [True, True, False, True, False, False] and the resulting group indices will be [0, 0, 0, 1, 1, 2, 3].

The scan can also be written pointfree as scanr (bool succ id . uncurry p) 0, since the scan direction doesn’t matter (although the group indices will be reversed). The group index might be handy or just more readable to keep as an integer, but it could be a Bool instead, because the minimum size of a group is 1: the functional argument of the scan would be bool not id . uncurry p, which can be simplified to (==) . uncurry p. And several of these parts could be factored into reusable functions, like zipNext = zip <*> drop 1, but I’ve inlined them for simplicity’s sake.

groupBy-like function such that the binary predicate holds between consecutive elements of each group instead of any two

Tags:

functional-programming

haskell

grouping

Enlico

1 Answers

Jon Purdy

Recent Activity

Donate For Us

groupBy-like function such that the binary predicate holds between consecutive elements of each group instead of any two

Tags:

functional-programming

haskell

grouping

Enlico

1 Answers

Jon Purdy

Related questions

Recent Activity

Donate For Us