I'm trying to write a basic lexer using Haskell. For implementing the DFA and NFA, I decided to put some common functions into classes FA and FAState.
-- |A class for defining the common functionality of all finite automatons.
class FA a b where
mutateId :: a -> Int -> a -- ^Returns a new FA by changing the sId of the input FA.
mutateType :: a -> StateType -> a -- ^Returns a new FA by changing the stateType of the input FA.
addTransition :: a -> (b, a) -> a -- ^Returns a new FA by adding a new transition to the input FA.
-- |A class for defining the common functionality of all finite automaton states.
class FA a b => FAState a b where
sId :: a -> Int -- ^An unique identifier for the state(hence the prefix s).
sType :: a -> StateType -- ^The type of the state.
sTransitions :: a -> Transitions b a -- ^The transitions that occur from this state.
where,
-- |A type which identifies different types of a FA state.
data StateType = Start | Normal | Final
deriving (Show, Read, Eq)
-- |A type which represents a list of transitions on input a to b.
-- Eg. [(Char, DFA)] represents the transition on a Char input.
type Transitions a b = [(a, b)]
Hence, b represents the data type for which transitions occur. For a DFA, b = Char, while for a NFA, b = Symbol.
data Symbol = Alphabet Char | Epsilon
deriving (Show, Read, Eq)
DFA and NFA are respectively defined as:
data DFA = DState Int StateType (Transitions Char DFA)
deriving (Show, Read)
data NFA = NState Int StateType (Transitions Symbol NFA)
deriving (Show, Read)
I'm having a problem with the instance definitions of FA & FAState:
instance FA DFA Char where
mutateId (DState i ty ts) new_i = DState new_i ty ts
mutateType (DState i ty ts) new_ty = DState i new_ty ts
addTransition (DState i ty ts) state = DState i ty (state:ts)
instance FAState DFA Char where
sId (DState i t ts) = i
sType (DState i t ts) = t
sTransitions (DState i t ts) = ts
instance FA NFA Symbol where
mutateId (NState i ty ts) new_i = NState new_i ty ts
mutateType (NState i ty ts) new_ty = NState i new_ty ts
addTransition (NState i ty ts) state = NState i ty (state:ts)
instance FAState NFA Symbol where
sId (NState i t ts) = i
sType (NState i t ts) = t
sTransitions (NState i t ts) = ts
On trying to run any of the functions I get a no instance error:
>>sId egNFA
<interactive>:15:1:
No instance for (FAState NFA b0)
arising from a use of `sId'
Possible fix: add an instance declaration for (FAState NFA b0)
In the expression: sId egNFA
In an equation for `it': it = sId egNFA
I don't understand whats happening here.
The root of your problem is this: instance dispatch will never make an inferred type more specific, even if that would allow it to choose an instance. This design decision is related to the so-called "open-world" model of classes: the goal is that code's behavior (including "whether it compiles") should never change just by adding instances of a typeclass.
Now, keeping that principle in mind, think about what you've asked the compiler to do: you've given an instance of FAState NFA Symbol
, and written an expression which is typeclass polymorphic and only fixes the first type to NFA
; the other is left completely open. The compiler could choose the single instance that's in scope (where the other variable is monomorphed to Symbol
), but this would violate our design principle: now adding an instance for (say) FAState NFA Widget
would result in ambiguous code, turning working, compilable code into non-compilable code. So the compiler conservatively refuses to compile even the version with only one instance in scope.
There's a few standard fixes:
Give a type signature to fix the other type, telling the compiler which instance to choose. Unfortunately, this solution won't work for you: your typeclass-polymorphic value sId :: FAState a b => a -> Int
doesn't mention both type variables a
and b
in its type. Since you can never use this value, I think modern GHC's will reject this type class a bit earlier (before you even write any instances or try to invoke the class methods), though I haven't got one lying around to test at the moment.
Just to give an example of this solution, consider sTransitions
instead of sId
: here the type signature does mention both variables, so you can turn the non-compiling sTransitions nfa
into the yes-compiling sTransitions nfa :: Transitions Symbol NFA
. (The more principled, generalizable transformation gives a type signature on the method only; for example, you can readily generalize the translation from sTransitions nfa
to (sTransitions :: NFA -> Transitions Symbol NFA) dfa
.)
Use function dependencies. The idea here is that the type of the state is completely determined by the type of the automaton, so morally it should be enough to just fix the first type variable in the class. The syntax that tells GHC about this fact looks like this:
class FAState a b | a -> b where
{- ...same as before -}
-- instance declarations look the same as before, too
This does two things: first, it tells GHC that if it knows a
, it can use this to choose an instance even if it doesn't yet know b
, and second, it tells GHC to double-check that no pair of instances of the class violate the functionality constraint, that is, that no two instances have the same a
but different b
.
Use (associated) type families. This is the same idea as the previous, but expressed in perhaps a more familiar paradigm. The syntax for this one looks like this:
class FAState a where
type State a
sId :: a -> Int
sType :: a -> StateType
sTransitions :: a -> Transitions (State a) a
instance FAState NFA where
type State NFA = Symbol
-- methods are the same as before
This introduces a new type constructor named State
(which you can use in type signatures and so forth). You can think of it as a function at the type level that takes as inputs types which are instances of the class FAState
and outputs the type of the state associated with that type of automaton.
Make your instance declarations more polymorphic. If GHC is complaining that it doesn't know enough about the second variable, well... you can always tell it that all instantiations of the second variable are equally good (up to some equality constraints). For example:
-- class definition same as before
instance b ~ Symbol => FAState NFA b where
-- methods same as before
The ~
is GHC's notation for type-level equality. The way this works is quite sneaky, and is well described elsewhere (I'll dig up some links if you really want them), but the short version of the explanation is that this tells GHC that if it knows enough to pick NFA
as the first variable, then it can commit to this instance straight away, and only after it's committed does it double-check that the second argument is actually Symbol
.
Enjoy!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With