Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Refactoring do notation into applicative style

Tags:

haskell

So I have been working on a simple expression solver in Haskell. I have been trying to refactor some of my code from do notation to applicative code, mainly because I am want to learn how applicatives work. I am lonst on how to reactor this

factor :: Parser Expr
factor = do
    char '('
    x <- buildExpr
    char ')'
    return x
<|> number
<|> variables
<?> "simple expression"

What would be the way to make this into an applicative style? I have tried the following but it wont type check

factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')

where buildExper has type Parser Expr.

like image 940
Oceanofthelost Avatar asked Dec 19 '22 12:12

Oceanofthelost


2 Answers

Short answer:

factor = (char '(' *> buildExpr <* char ')') <|> number <|> variables
     <?> "simple expression"

Long answer:

<$> has this type:

(<$>) :: (Functor f) => (a -> b) -> f a -> f b

In other words, it takes a function and a value of a type that is an instance of Functor (and returns something we don’t care about at the moment). Unfortunately, you aren’t giving it a function as the first argument; you’re giving it pure buildExpr, which is a Parser that, when executed, consumes no input and yields buildExpr. If you really wanted to do that, you could, with <*>:

factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')

That would run pure buildExpr, extract the function out of it, and then run that on the result of (char '(' *> buildExpr *> char ')'). But unfortunately, we can’t do that either: buildExpr is a Parser of some sort, not a function.

If you think about it enough, the thought should pass through your mind: why are we mentioning buildExpr twice if we only want to parse one? It turns out that it is sufficient to mention it only once. In fact, this probably does almost what you want:

factor = char '(' *> buildExpr *> char ')'

Only problem is, it will yield the Char ), not the result of buildExpr. Darn! But looking through the documentation and matching up the types, you should eventually be able to figure out that if you replace the second *> with a <*, it’ll all work out as you want it to:

factor = char '(' *> buildExpr <* char ')'

A good mnemonic for this is that the arrow points to the value you want to keep. Here, we don’t care about the parentheses, so the arrow points away; but we do want to keep the result of buildExpr, so the arrows point inwards toward it.

like image 53
icktoofay Avatar answered Jan 14 '23 11:01

icktoofay


All these operators are left associative; the < and/or > points to things which contribute values; it's $ for thing-to-left-is-pure-value and * for thing-to-left-is-applicative-computation.

My rule of thumb for using these operators goes as follows. First, list the components of your grammatical production and classify them as "signal" or "noise" depending on whether they contribute semantically important information. Here, we have

char '('      -- noise
buildExpr     -- signal
char ')'      -- noise

Next, figure out what the "semantic function" is, which takes the values of the signal components and gives the value for the whole production. Here, we have

id     -- pure semantic function, then a bunch of component parsers
       char '('      -- noise
       buildExpr     -- signal
       char ')'      -- noise

Now, each component parser will need to be attached to what comes before it with an operator, but which?

  • always start with <
  • next $ for the first component (as the pure function's just before), or * for every other component
  • then comes > if the component is signal or if it's noise

So that gives us

id     -- pure semantic function, then a bunch of parsers
   <$  char '('      -- first, noise
   <*> buildExpr     -- later, signal
   <*  char ')'      -- later, noise

If the semantic function is id, as here, you can get rid of it and use *> to glue noise to the front of the signal which is id's argument. I usually choose not to do that, just so that I can see the semantic function sitting clearly at the beginning of the production. Also, you can build a choice between such productions by interspersing <|> and you don't need to wrap any of them in parentheses.

like image 31
pigworker Avatar answered Jan 14 '23 13:01

pigworker