I'm looking for a way to transform a list into an n-tuple with one list for each of the n constructors in a disjoint union. The standard library defines a similar function specifically for Either
s:
partitionEithers :: [Either a b] -> ([a], [b])
I'm looking for techniques for solving the generalized problem with the following requirements:
Here is an example specification with two proposed solutions:
partitionSum :: [MySum] -> ([A], [B], [C], [D])
data MySum
= CaseA A
| CaseB B
| CaseC C
| CaseD D
data A = A deriving Show
data B = B deriving Show
data C = C deriving Show
data D = D deriving Show
-- expect "([A,A],[B,B,B],[],[D])"
test :: IO ()
test = print . partitionSum $
[CaseD D, CaseB B, CaseA A, CaseA A, CaseB B, CaseB B]
First attempt: n list comprehensions that traverse the list n times.
partitionSum1 :: [MySum] -> ([A], [B], [C], [D])
partitionSum1 xs =
( [a | CaseA a <- xs]
, [b | CaseB b <- xs]
, [c | CaseC c <- xs]
, [d | CaseD d <- xs]
)
Second attempt: a single traversal of the input list. I have to manually thread the state through the fold which makes the solution a little repetitive and annoying to write.
partitionSum2 :: [MySum] -> ([A], [B], [C], [D])
partitionSum2 = foldr f ([], [], [], [])
where
f x (as, bs, cs, ds) =
case x of
CaseA a -> (a : as, bs, cs, ds)
CaseB b -> (as, b : bs, cs, ds)
CaseC c -> (as, bs, c : cs, ds)
CaseD d -> (as, bs, cs, d : ds)
This one lists all of the vowels in the sentence, in the order they're seen within it: In Haskell, lists are homogeneous -- they can only store one kind of value ( Num, Bool, Char, etc.). If you want to store heterogeneous values, you need to use a tuple (created using parentheses):
For instance, check out this paper written by three of the designers of Haskell in 1999.) List comprehensions have an output function, one or more input sets, and one or more predicates, in that order. A basic list comprehension looks like: ghci> [ <output function> | <input set>, ..., <predicate>, ... ]
Haskell classes (also called typeclasses) are sort of like Java interfaces in that any child class derived from a particular parent class is guaranteed to implement some specific behaviour. Classes which implement Eq can be tested for equality.
...and so on. Double, Float, Int, Integer, and other predefined types also exist in Haskell, but -- as type inference gives the variable the widest possible scope (usually Num or Fractional for numbers) -- you have to explicitly declare a variable as one of these narrower types: What's the difference between Int and Integer, though?
In addition to the Representable
answer:
A thing that came to me from seeing foldr f ([], [], [], [])
was to define a monoid where the nil case is mempty
{-# DerivingVia #-}
..
import GHC.Generics (Generically(..), ..)
type Classify :: Type
type Classify = C [A] [B] [C] [D]
deriving
stock Generic
deriving (Semigroup, Monoid)
via Generically Classify
-- mempty = C [] [] [] []
-- C as bs cs ds <> C as1 bs1 cd1 ds1 = C (as ++ as1) (bs ++ bs1) (cs ++ cs1) (ds ++ ds1)
Generically
will be exported from GHC.Generics
in the future. It defines Classify
as a semigroup and monoid through generic pointwise lifting.
With this all you need is a classifier function, that classifies a MySum
into Classify
and you can define partition
in terms of foldMap
classify :: MySum -> Classify
classify = \case
SumA a -> C [a] [] [] []
SumB b -> C [] [b] [] []
SumC c -> C [] [] [c] []
SumD d -> C [] [] [] [d]
partition :: Foldable f => f MySum -> Classify
partition = foldMap classify
As your function is a transformation from sums to products, there's a fairly simple implementation using generics-sop
. This is a library which enhances GHCs generics with more specialized types that make induction on algebriac type (i.e. sums of products) simpler.
First, a prelude:
{-# LANGUAGE DeriveGeneric, StandaloneDeriving #-}
import Generics.SOP hiding ((:.:))
import qualified GHC.Generics as GHC
import GHC.Generics ((:.:)(..))
partitionSum :: (Generic t) => [t] -> NP ([] :.: NP I) (Code t)
This is the method you want to write. Let's examine its type.
Generic
is the one from generics-sop
, not from GHCNP I
(itself an n-ary product, because generally, algebraic datatype constructors might have more than one field)Code t
is the sum-of-products type representation of t
. It's a list of lists of type. e.g. Code (Either a b) ~ '[ '[a], '[b] ]
. The generic value representation of t
is SOP I (Code t)
- a sum of of products over the "code".To implement this, we can convert each t
to its generic representation, then fold over the resulting list:
partitionSum = partitionSumGeneric . map from
partitionSumGeneric :: SListI xss => [SOP I xss] -> NP ([] :.: NP I) xss
partitionSumGeneric = foldr (\(SOP x) -> classifyGeneric x) emptyClassifier
partitionSumGeneric
is pretty much the same as partitionSum
, but operates on generic representations of values.
Now for the interesting part. Let's begin with the base case of our fold. This should contain empty lists in every position. generics-sop
provides a handy mechanism for generating a product type with a uniform value in each position:
emptyClassifier :: SListI xs => NP ([] :.: NP I) xs
emptyClassifier = hpure (Comp1 [])
The recursive case is as follows: if the value has tag at index k
, add that value to the list at index k
in the accumulator. We can do this with simultaneous recursion on both the sum type (it's generic now, so a value of type NS (NP I) xs
- a sum of products) and on the accumulator.
classifyGeneric :: NS (NP I) xss -> NP ([] :.: NP I) xss -> NP ([] :.: NP I) xss
classifyGeneric (Z x) (Comp1 l :* ls) = (Comp1 $ x : l) :* ls
classifyGeneric (S xs) ( l :* ls) = l :* classifyGeneric xs ls
Your example with some added data to make it a bit more interesting:
data MySum
= CaseA A
| CaseB B
| CaseC C
| CaseD D
-- All that's needed for `partitionSum' to work with your type
deriving instance GHC.Generic MySum
instance Generic MySum
data A = A Int deriving Show
data B = B String Int deriving Show
data C = C deriving Show
data D = D Integer deriving Show
test = partitionSum $
[CaseD $ D 0, CaseB $ B "x" 1, CaseA $ A 2, CaseA $ A 3, CaseB $ B "y" 4, CaseB $ B "z" 5]
the result is:
Comp1 {unComp1 = [I (A 2) :* Nil,I (A 3) :* Nil]} :* Comp1 {unComp1 = [I (B "x" 1) :* Nil,I (B "y" 4) :* Nil,I (B "z" 5) :* Nil]} :* Comp1 {unComp1 = []} :* Comp1 {unComp1 = [I (D 0) :* Nil]} :*Nil
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With