If I have a parser a : Parser A
and a parser b : Parser B
then I can combine it into a parser a | b : Parser (Either A B)
. This works but gets a little tricky when you start adding more alternatives and getting types like Either A (Either B C)
. I can imagine flattening the previous type into something like Alternative A B C
. Is there a standard transformation I can perform or am I stuck with generating a whole bunch of boilerplate for types like Alternative A B C ...
.
So the interesting thing about Either
is that you can use it as a type-level cons
operator.
A `Either` (B `Either` (C `Either` (D `Either` Void))) --> [A,B,C,D]
So all we need do is make that explicit. You'll need ghc-7.8 to support closed data families:
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE DataKinds #-}
-- ...
type family OneOf (as :: [*]) :: * where
OneOf '[a] = a
OneOf (a ': as) = Either a (OneOf as)
Now you can write your types much more succinctly:
aorborc :: Parser (OneOf '[A, B, C])
aorborc = a | (b | c)
It's still Either
under the hood, so you can still easily interoperate with all existing code that uses Either
, which is nice.
Either is just one possible sum type in Haskell, and because of the ready made class instances and helper functions is useful for many cases, but becomes considerably clunkier when you nest it.
The best approach for a parser is to create your own data type that mirrors the structure you're parsing and parse directly into that. Let's make a partial toy example about a toy language.
data Statement = TypeDec String Type
DataDec String [Constructor]
FunctionDec String LambdaExpression
statement :: Parser Statement
statement = TypeDec <$> string "type " *> identifier <*> string " = " *> type
<|> DataDec <$> string "data " *> identifier <*> string " = " *> many constructor
<|> FunctionDec <$> identifier <*> string " = " *> lambdaExpression
In this way, both your data structure and your code mirror the productions in the grammar you're parsing. The great benefit to that is that your data is type safe, clear and ready to use as soon as it's parsed.
(I can never remember the fixities of *>
and <*
, so I've probably done it the way you need brackets or something, but hopefully you get the idea.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With