What's a better way of managing large Haskell records?

Replacing field names with letters, I have cases like this:

data Foo = Foo { a :: Maybe ...
               , b :: [...]
               , c :: Maybe ...
               , ... for a lot more fields ...
               } deriving (Show, Eq, Ord)

instance Writer Foo where
  write x = maybeWrite (a x) ++
            listWrite  (b x) ++
            maybeWrite (c x) ++
            ... for a lot more fields ...

parser = permute (Foo
                   <$?> (Nothing, Just `liftM` aParser)
                   <|?> ([], bParser)
                   <|?> (Nothing, Just `liftM` cParser)
                   ... for a lot more fields ...

-- this is particularly hideous
foldl1 merge [foo1, foo2, ...]
merge (Foo a b c ...seriously a lot more...)
      (Foo a' b' c' ...) = 
        Foo (max a a') (b ++ b') (max c c') ...

What techniques would allow me to better manage this growth?

In a perfect world a, b, and c would all be the same type so I could keep them in a list, but they can be many different types. I'm particularly interested in any way to fold the records without needing the massive patterns.

I'm using this large record to hold the different types resulting from permutation parsing the vCard format.

Update

I've implemented both the generics and the foldl approaches suggested below. They both work, and they both reduce three large field lists to one.

Robert Campbell asked Mar 12 '23 18:03


1 Answer

Datatype-generic programming techniques can be used to transform all the fields of a record in some "uniform" sort of way.

Perhaps all the fields in the record implement some typeclass that we want to use (the typical example is Show). Or perhaps we have another record of "similar" shape that contains functions, and we want to apply each function to the corresponding field of the original record.

For these kinds of uses, the generics-sop library is a good option. It extends GHC's built-in generics support with extra type-level machinery that provides analogues of functions like sequence or ap that work over all the fields of a record.

Using generics-sop, I tried to create a slightly less verbose version of your merge function. Some preliminary pragmas and imports:

{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}

import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP

A helper function that lifts a binary operation to a form usable by the functions of generics-sop:

fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a -- I is simply an Identity functor
fn_2' = fn_2 . liftA2

A general merge function that takes a vector of operators and works on any single-constructor record that derives Generic:

merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a 
merge funcs reg1 reg2 =
    case (from reg1, from reg2) of 
        (SOP (Z np1), SOP (Z np2)) -> 
            let npResult  = funcs `hap` np1 `hap` np2
            in  to (SOP (Z npResult))

Code is a type family that returns a type-level list of lists describing the structure of a datatype. The outer list is for constructors, the inner lists contain the types of the fields for each constructor.

The Code a ~ '[ xs ] part of the constraint says "the datatype can only have one constructor" by requiring the outer list to have exactly one element.

The SOP (Z _) pattern matches extract the (heterogeneous) vector of field values from the record's generic representation. SOP stands for "sum-of-products".
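To make the shape of Code more concrete, here is a small sketch that asks the type checker to confirm it. The Contact type and the personCode and contactCode names are made up for illustration; the module compiles only if Code computes exactly the type-level lists written in the signatures.

```haskell
{-# language DataKinds #-}
{-# language DeriveGeneric #-}

import qualified GHC.Generics as GHC
import Generics.SOP

-- A single-constructor record: the outer list of its Code has one element.
data Person = Person { name :: String, age :: Int }
    deriving (Show, GHC.Generic)

instance Generic Person

-- Hypothetical type with two constructors: the outer list has two elements.
data Contact = Email String | Phone Int Int
    deriving (Show, GHC.Generic)

instance Generic Contact

-- Each definition is just id, so it typechecks only when the argument
-- and result types are equal, i.e. when Code yields the list shown.
personCode :: Proxy (Code Person) -> Proxy '[ '[ String, Int ] ]
personCode = id

contactCode :: Proxy (Code Contact) -> Proxy '[ '[ String ], '[ Int, Int ] ]
contactCode = id
```

Because Code Contact has two elements in its outer list, a constraint like Code a ~ '[ xs ] would reject Contact at compile time, while Person satisfies it.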

A concrete example:

data Person = Person
    {
        name :: String
    ,   age :: Int
    } deriving (Show,GHC.Generic)

instance Generic Person -- this Generic is from generics-sop

mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)

The Nil and :* constructors are used to build the vector of operators (the type is called NP, from n-ary product). If the vector doesn't match the number of fields in the record, the program won't compile.
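To connect this back to the foldl1 in the question, the pieces above can be assembled into a single module along these lines. This is a sketch: mergeAll and the sample values are hypothetical names introduced here, and Eq is derived only so results can be compared.

```haskell
{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}

import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP

-- Lift a plain binary operation into the wrapped form that hap expects.
fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a
fn_2' = fn_2 . liftA2

-- Merge two values of a single-constructor type field by field,
-- using one operator per field.
merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a
merge funcs reg1 reg2 =
    case (from reg1, from reg2) of
        (SOP (Z np1), SOP (Z np2)) ->
            to (SOP (Z (funcs `hap` np1 `hap` np2)))

data Person = Person
    { name :: String
    , age  :: Int
    } deriving (Show, Eq, GHC.Generic)

instance Generic Person

mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)

-- Hypothetical helper: fold a non-empty list of records into one.
-- Like foldl1 itself, it fails on an empty list.
mergeAll :: [Person] -> Person
mergeAll = foldl1 mergePerson
```

In GHCi, mergeAll [Person "Ann" 1, Person "Bob" 2, Person "Cy" 3] evaluates to Person {name = "AnnBobCy", age = 6}. The field list now appears only in the data declaration and in the operator vector, with no constructor pattern listing every field.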

Update. Given that the types in your record are highly uniform, an alternative way of creating the vector of operations is to define instances of an auxiliary typeclass for each field type, and then use the hcpure function:

class Mergeable a where
    mergeFunc :: a -> a -> a

instance Mergeable String where -- String is a synonym for [Char], so this needs FlexibleInstances
    mergeFunc = (++)

instance Mergeable Int where
    mergeFunc = (+)

mergePerson :: Person -> Person -> Person
mergePerson = merge (hcpure (Proxy :: Proxy Mergeable) (fn_2' mergeFunc))

The hcliftA2 function (which combines hcpure, fn_2 and hap) could be used to simplify things further.
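A minimal sketch of that simplification follows. It assumes the Mergeable class from above; mergeFields is a hypothetical name, and the FlexibleContexts and ConstraintKinds pragmas are enabled defensively rather than because every GHC version is known to need them. The operator vector disappears entirely, because hcliftA2 picks the Mergeable instance for each field.

```haskell
{-# language ConstraintKinds #-}
{-# language DataKinds #-}
{-# language DeriveGeneric #-}
{-# language FlexibleContexts #-}
{-# language FlexibleInstances #-}
{-# language TypeFamilies #-}

import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP

class Mergeable a where
    mergeFunc :: a -> a -> a

instance Mergeable String where -- needs FlexibleInstances
    mergeFunc = (++)

instance Mergeable Int where
    mergeFunc = (+)

-- Merge any single-constructor type whose fields are all Mergeable.
-- liftA2 mergeFunc works on the I-wrapped field values.
mergeFields :: (Generic a, Code a ~ '[ xs ], All Mergeable xs) => a -> a -> a
mergeFields reg1 reg2 =
    case (from reg1, from reg2) of
        (SOP (Z np1), SOP (Z np2)) ->
            to (SOP (Z (hcliftA2 (Proxy :: Proxy Mergeable) (liftA2 mergeFunc) np1 np2)))

data Person = Person
    { name :: String
    , age  :: Int
    } deriving (Show, Eq, GHC.Generic)

instance Generic Person

mergePerson :: Person -> Person -> Person
mergePerson = mergeFields
```

With these definitions, mergePerson (Person "Ann" 1) (Person "Bob" 2) evaluates to Person {name = "AnnBob", age = 3}. Adding a field to the record then requires no change to the merge code as long as the field's type has a Mergeable instance.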

danidiaz answered Mar 20 '23 17:03