
Why doesn't "between (char '"') (char '"') (many charLiteral)" work for parsing string literals?

The documentation for Text.Megaparsec.Char.Lexer.charLiteral suggests using char '"' *> manyTill charLiteral (char '"') for parsing string literals (where manyTill is defined in the module Control.Applicative.Combinators in the parser-combinators library).

However, Control.Applicative.Combinators also defines between, which -- as far as I can see -- should do the same as the above suggestion when used like so: between (char '"') (char '"') (many charLiteral).

Yet the between parser above does not work for parsing string literals: it fails with "unexpected end of input. expecting '"' or literal character" (indicating that the ending quote is never detected). Why not?

Also, more generally, why isn't between pBegin pEnd (many p) equivalent to pBegin *> manyTill p pEnd?

runeks asked Mar 03 '23 11:03



1 Answer

between l r m doesn't do anything spectacular: it simply runs l, then m, then r, and gives back the result of m. So, in between (char '"') (char '"') (many charLiteral), the many charLiteral doesn't know it's not supposed to consume the closing ". The many just keeps applying its argument parser for as long as it succeeds, and because charLiteral accepts any character, including a bare ", it churns right through everything until the end of the input. The second char '"' has no way of stopping this, and the parser does not backtrack into many to hand a character back, so it has to make do with what's left, i.e. it fails because there is nothing left!
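To make this concrete, here is a minimal sketch of the failing parser. The name between' is a hypothetical stand-in written for illustration; it mirrors what between does but is not copied from the library source. The String input stream and the Parser type alias are likewise assumptions of this example.

```haskell
module Main (main) where

import Data.Void (Void)
import Text.Megaparsec (Parsec, many, parseTest)
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumed setup for this sketch: no custom errors, String input.
type Parser = Parsec Void String

-- Hypothetical stand-in for `between`: run open, then p, then close,
-- and keep only the result of p. Nothing here tells p when to stop.
between' :: Applicative m => m open -> m close -> m a -> m a
between' open close p = open *> p <* close

-- `many L.charLiteral` also accepts the closing quote as an ordinary
-- character, so the final `char '"'` finds nothing left to match.
brokenString :: Parser String
brokenString = between' (char '"') (char '"') (many L.charLiteral)

main :: IO ()
main = parseTest brokenString "\"hello\""
```

Running this should report a parse error at the end of the input, along the lines of the message quoted in the question, rather than returning hello.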

By contrast, manyTill actually checks whether the “till” parser matches before each iteration, and only applies the content parser when it doesn't. Therefore, the terminating " is never passed to charLiteral, and you get the desired behaviour.
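The difference can be sketched as follows. The name manyTill' is a hypothetical re-implementation written for illustration; it follows the "try the end parser first, otherwise parse one more item" idea described above and is not claimed to be the library's exact code. The Parser alias and the String input stream are again assumptions of this example.

```haskell
module Main (main) where

import Control.Applicative (Alternative, (<|>))
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseTest)
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumed setup for this sketch: no custom errors, String input.
type Parser = Parsec Void String

-- Hypothetical stand-in for `manyTill`: on every iteration, try `end`
-- first; only if it fails, run `p` once and loop again.
manyTill' :: Alternative m => m a -> m end -> m [a]
manyTill' p end = go
  where
    go = ([] <$ end) <|> ((:) <$> p <*> go)

-- The closing quote is claimed by `char '"'` before charLiteral sees it.
stringLiteral :: Parser String
stringLiteral = char '"' *> manyTill' L.charLiteral (char '"')

main :: IO ()
main = do
  parseTest stringLiteral "\"hello\""
  -- An escaped quote starts with a backslash, so the end parser does
  -- not match it and charLiteral consumes the whole escape sequence.
  parseTest stringLiteral "\"say \\\"hi\\\"\""
```

Because the end parser, char '"', fails without consuming input on any character other than a quote, the alternative falls through to charLiteral cleanly. An end parser that can consume input before failing would presumably need to be wrapped in try.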

leftaroundabout answered Mar 04 '23 23:03
