
Designing a data type with lots of constructors in Haskell

Tags: haskell

Is there an alternative way to write something like this:

data Message = Message1 Int Int ByteString
             | Message2 Double Int Int
             | Message3 Double Double
             -- ...
             | Message256 CustomType

There are far too many constructors, and it is difficult to use record syntax. What I really want to do is write a parser. Are there alternative approaches for this?

parse :: ByteString -> Parser Message
Kai asked Nov 03 '22 17:11



1 Answer

First off, there is a plan to implement overloaded record fields for Haskell, which would allow the same field name to be used in different records; you would only have to specify explicitly which one you mean in the cases where the compiler can't figure it out by itself.

That being said ...


I have found that the most reliable and convenient way to deal with this is one Haskell type per message type.

You would have:

data Message1 = Message1 Int Int ByteString -- can use records here
data Message2 = Message2 Double Int Int
data Message3 = Message3 { m3_a :: Double, m3_b :: Double }
--          .....
data Message256 = Message256 CustomType

-- A sum type over all possible message types:
data AnyMessage = M1   Message1
                | M2   Message2
                | M3   Message3
                -- ...
                | M256 Message256
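
Since the question is really about parsing, here is a minimal sketch of how a parser could dispatch into such a sum type. Everything about the wire format is an assumption made for illustration (one leading tag byte, one-byte numeric fields), only two of the message types are shown, and a plain `Either String` result stands in for the question's unspecified `Parser` type. The names `parseMessage1`, `parseMessage3` and `parseAny` are hypothetical.

```haskell
import qualified Data.ByteString as BS
import           Data.ByteString (ByteString)

data Message1 = Message1 Int Int ByteString
  deriving (Eq, Show)
data Message3 = Message3 { m3_a :: Double, m3_b :: Double }
  deriving (Eq, Show)

-- Sum type over the message types covered by this sketch.
data AnyMessage = M1 Message1
                | M3 Message3
  deriving (Eq, Show)

-- Assumed layout: two one-byte integers, then the remaining bytes.
parseMessage1 :: ByteString -> Either String Message1
parseMessage1 bs = case BS.unpack (BS.take 2 bs) of
  [a, b] -> Right (Message1 (fromIntegral a) (fromIntegral b) (BS.drop 2 bs))
  _      -> Left "Message1: need at least 2 payload bytes"

-- Assumed layout: exactly two bytes, each widened to a Double.
parseMessage3 :: ByteString -> Either String Message3
parseMessage3 bs = case BS.unpack bs of
  [a, b] -> Right (Message3 (fromIntegral a) (fromIntegral b))
  _      -> Left "Message3: expected exactly 2 payload bytes"

-- Read the (assumed) tag byte, then hand the payload to the parser
-- for that message type and wrap the result in the sum type.
parseAny :: ByteString -> Either String AnyMessage
parseAny bs = case BS.uncons bs of
  Nothing           -> Left "empty input"
  Just (1, payload) -> M1 <$> parseMessage1 payload
  Just (3, payload) -> M3 <$> parseMessage3 payload
  Just (tag, _)     -> Left ("unknown tag " ++ show tag)
```

Each per-message parser stays small and can be tested on its own; the dispatcher only grows by one line per message type.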

Benefits of this include:

  • You can use records (you still need a different field prefix per type, but that is usually fine)
  • It is much safer than sharing records across constructors:

    data T = A { field :: Int }
           | B { field :: Int }
           | C { bla :: Double } -- has no `field` selector
    
    print (field (C 2.3)) -- will crash at runtime, no compiler warning
    
  • You can now write functions that only work on certain message types.

  • You can now write functions that only work on a subset of message types (e.g. 3 of them): all you need is another sum type over that subset.
  • Code dealing with this is still quite elegant:

    process :: AnyMessage -> IO ()
    process anyMsg = case anyMsg of
        M1 (Message1 x y bs) -> ...
        ...
        M3 Message3{ m3_a, m3_b } -> ... -- using NamedFieldPuns
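
The record-safety point can be made concrete with a small sketch. The type `T` is the one from the bullet above; `MsgA`, `MsgC`, their prefixed selectors and `fieldMaybe` are hypothetical names added here for illustration.

```haskell
-- Shared selector across constructors: `field` is partial,
-- because the C constructor has no such field.
data T = A { field :: Int }
       | B { field :: Int }
       | C { bla :: Double }

-- One type per constructor: each selector is total on its own type.
newtype MsgA = MsgA { a_field :: Int }
newtype MsgC = MsgC { c_bla :: Double }

-- If the shared-field type has to stay, a total accessor makes the
-- missing case explicit instead of failing at runtime.
fieldMaybe :: T -> Maybe Int
fieldMaybe t = case t of
  A n -> Just n
  B n -> Just n
  C _ -> Nothing

-- a_field (MsgC 2.3)   -- rejected by the type checker: MsgC is not MsgA
```

With separate types, the mistake from the bullet above turns into a type error at compile time rather than an exception at runtime.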
    

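The "another sum type" idea for subsets might look like the following sketch. The subset chosen here (`NumericMessage`, covering only `Message2` and `Message3`) and the helper names `toNumeric`, `fromNumeric` and `firstDouble` are assumptions for illustration, not part of the original design.

```haskell
{-# LANGUAGE NamedFieldPuns #-}

import qualified Data.ByteString as BS

data Message1 = Message1 Int Int BS.ByteString
  deriving (Eq, Show)
data Message2 = Message2 Double Int Int
  deriving (Eq, Show)
data Message3 = Message3 { m3_a :: Double, m3_b :: Double }
  deriving (Eq, Show)

data AnyMessage = M1 Message1
                | M2 Message2
                | M3 Message3
  deriving (Eq, Show)

-- Hypothetical subset: only the messages that start with a Double.
data NumericMessage = NM2 Message2
                    | NM3 Message3
  deriving (Eq, Show)

-- Narrowing can fail, so it returns Maybe.
toNumeric :: AnyMessage -> Maybe NumericMessage
toNumeric (M2 m) = Just (NM2 m)
toNumeric (M3 m) = Just (NM3 m)
toNumeric (M1 _) = Nothing

-- Widening back to the full sum type always succeeds.
fromNumeric :: NumericMessage -> AnyMessage
fromNumeric (NM2 m) = M2 m
fromNumeric (NM3 m) = M3 m

-- A function that only accepts the subset; it is total, and the
-- compiler can check that every case is handled.
firstDouble :: NumericMessage -> Double
firstDouble (NM2 (Message2 d _ _)) = d
firstDouble (NM3 Message3{ m3_a }) = m3_a
```

The cost is one small conversion function per subset; in exchange, functions like `firstDouble` never need a catch-all case for message types they cannot handle.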
I have used this pattern multiple times in production, and it leads to very robust code.

nh2 answered Nov 15 '22 08:11
