Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Making attoparsec parsers recursive

I've been coding up an attoparsec parser and have been hitting a pattern where I want to turn parsers into recursive parsers (recursively combining them with the monad bind >>= operator).

So I created a function to turn a parser into a recursive parser as follows:

recursiveParser :: (a -> A.Parser a) -> a -> A.Parser a
recursiveParser parser a = (parser a >>= recursiveParser parser) <|> return a

Which is useful if you have a recursive data type like

data Expression = ConsExpr Expression Expression | EmptyExpr

parseRHS :: Expression -> Parser Expression
parseRHS e = ConsExpr e <$> parseFoo

parseExpression :: Parser Expression
parseExpression = parseLHS >>= recursiveParser parseRHS
  where parseLHS = parseRHS EmptyExpr

Is there a more idiomatic solution? It almost seems like recursiveParser should be some kind of fold... I also saw sepBy in the docs, but this method seems to suit me better for my application.

EDIT: Oh, actually now that I think about it should actually be something similar to fix... Don't know how I forgot about that.

EDIT2: Rotsor makes a good point with his alternative for my example, but I'm afraid my AST is actually a bit more complicated than that. It actually looks something more like this (although this is still simplified)

data Segment = Choice1 Expression
             | Choice2 Expression
data Expression = ConsExpr Segment Expression 
                | Token String
                | EmptyExpr

where the string a -> b brackets to the right and c:d brackets to the left, with : binding more tightly than ->.

I.e. a -> b evaluates to

(ConsExpr (Choice1 (Token "a")) (Token "b"))

and c:d evaluates to

(ConsExpr (Choice2 (Token "d")) (Token "c"))

I suppose I could use foldl for the one and foldr for the other but there's still more complexity in there. Note that it's recursive in a slightly strange way, so "a:b:c -> e:f -> :g:h ->" is actually a valid string, but "-> a" and "b:" are not. In the end fix seemed simpler to me. I've renamed the recursive method like so:

fixParser :: (a -> A.Parser a) -> a -> A.Parser a
fixParser parser a = (parser a >>= fixParser parser) <|> pure a

Thanks.

like image 907
Rehno Lindeque Avatar asked Oct 11 '22 17:10

Rehno Lindeque


1 Answers

Why not just parse a list and fold it into whatever you want later? Maybe I am missing something, but this looks more natural to me:

consChain :: [Expression] -> Expression
consChain = foldl ConsExpr EmptyExpr

parseExpression :: Parser Expression
parseExpression = consChain <$> many1 parseFoo

And it's shorter too.

As you can see, consChain is now independent from parsing and can be useful somewhere else. Also, if you separate out the result folding, the somewhat unintuitive recursive parsing simplifies down to many or many1 in this case.

You may want to take a look at how many is implemented too:

many :: (Alternative f) => f a -> f [a]
many v = many_v
    where many_v = some_v <|> pure []
          some_v = (:) <$> v <*> many_v

It has a lot in common with your recursiveParser:

  • some_v is similar to parser a >>= recursiveParser parser
  • many_v is similar to recursiveParser parser

You may ask why I called your recursive parser function unintuitive. This is because this pattern allows parser argument to affect the parsing behaviour (a -> A.Parser a, remember?), which may be useful, but not obviously (I don't see a use case for this yet). The fact that your example does not use this feature makes it look redundant.

like image 127
Rotsor Avatar answered Nov 02 '22 10:11

Rotsor