So I have been working on a simple expression solver in Haskell. I have been trying to refactor some of my code from do notation to applicative code, mainly because I am want to learn how applicatives work. I am lonst on how to reactor this
factor :: Parser Expr
factor = do
char '('
x <- buildExpr
char ')'
return x
<|> number
<|> variables
<?> "simple expression"
What would be the way to make this into an applicative style? I have tried the following but it wont type check
factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')
where buildExper has type Parser Expr.
Short answer:
factor = (char '(' *> buildExpr <* char ')') <|> number <|> variables
<?> "simple expression"
Long answer:
<$>
has this type:
(<$>) :: (Functor f) => (a -> b) -> f a -> f b
In other words, it takes a function and a value of a type that is an instance of Functor
(and returns something we don’t care about at the moment). Unfortunately, you aren’t giving it a function as the first argument; you’re giving it pure buildExpr
, which is a Parser
that, when executed, consumes no input and yields buildExpr
. If you really wanted to do that, you could, with <*>
:
factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')
That would run pure buildExpr
, extract the function out of it, and then run that on the result of (char '(' *> buildExpr *> char ')')
. But unfortunately, we can’t do that either: buildExpr
is a Parser
of some sort, not a function.
If you think about it enough, the thought should pass through your mind: why are we mentioning buildExpr
twice if we only want to parse one? It turns out that it is sufficient to mention it only once. In fact, this probably does almost what you want:
factor = char '(' *> buildExpr *> char ')'
Only problem is, it will yield the Char
)
, not the result of buildExpr
. Darn! But looking through the documentation and matching up the types, you should eventually be able to figure out that if you replace the second *>
with a <*
, it’ll all work out as you want it to:
factor = char '(' *> buildExpr <* char ')'
A good mnemonic for this is that the arrow points to the value you want to keep. Here, we don’t care about the parentheses, so the arrow points away; but we do want to keep the result of buildExpr
, so the arrows point inwards toward it.
All these operators are left associative; the <
and/or >
points to things which contribute values; it's $
for thing-to-left-is-pure-value and *
for thing-to-left-is-applicative-computation.
My rule of thumb for using these operators goes as follows. First, list the components of your grammatical production and classify them as "signal" or "noise" depending on whether they contribute semantically important information. Here, we have
char '(' -- noise
buildExpr -- signal
char ')' -- noise
Next, figure out what the "semantic function" is, which takes the values of the signal components and gives the value for the whole production. Here, we have
id -- pure semantic function, then a bunch of component parsers
char '(' -- noise
buildExpr -- signal
char ')' -- noise
Now, each component parser will need to be attached to what comes before it with an operator, but which?
<
$
for the first component (as the pure function's just before), or *
for every other component>
if the component is signal or
if it's noise
So that gives us
id -- pure semantic function, then a bunch of parsers
<$ char '(' -- first, noise
<*> buildExpr -- later, signal
<* char ')' -- later, noise
If the semantic function is id
, as here, you can get rid of it and use *>
to glue noise to the front of the signal which is id
's argument. I usually choose not to do that, just so that I can see the semantic function sitting clearly at the beginning of the production. Also, you can build a choice between such productions by interspersing <|>
and you don't need to wrap any of them in parentheses.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With