Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cannot compute minimal length of a parser - uu-parsinglib in Haskell

Lets see the code snippet:

pSegmentBegin p i   = pIndentExact i *> ((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure []))

if I change this code in my parser to:

pSegmentBegin p i   = do
    pIndentExact i
    ((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure []))

I've got an error:

canot compute minmal length of a parser due to occurrence of a moadic bind, use addLength to override

I thought the above parser should behave the same way. Why this error can occur?

EDIT

The above example is very simple (to simplify the question) and as noted below it is not necessary to use do notation here, but the real case I wanted it to use is as follows:

pSegmentBegin p i   = do
    j <- pIndentAtLast i
    (:) <$> p j <*> ((pEOL *> pSegments p j) <|> pure [])

I have noticed that adding "addLength 1" before the do statement solves the problem, but I'm unsure if its a correct solution:

pSegmentBegin p i   = addLength 2 $ do
    j <- pIndentAtLast i
    (:) <$> p j <*> ((pEOL *> pSegments p j) <|> pure [])
like image 789
Wojciech Danilo Avatar asked Aug 16 '13 13:08

Wojciech Danilo


1 Answers

As I have mentioned many times the monadic interface should be avoided whenever possible. let me try to explain why the applicative interface is to be preferred.

One of the distinctive features of my library is that it performs error correction by inserting or deleting problems. Of course we can take an umlimited look-ahead here but that would make the process VERY expensive. So we take only a limited lookahead of three steps.

Now suppose we have to insert an expression and one of the expression alternatives is:

expr := "if" expr "then" expr "else" expr

then we want to exclude this alternative since choosing this alternative would necessitate the insertion of another expression etc. So we perform an abstract interpretation of the alternatives and make sure that in case of a draw (i.e. equal costs for the limited lookahead) we take one of the non-recursive alternatives.

Unfortunately this scheme breaks down when one writes monadic parsers, since the length of the right hand side of the bind may depend on the result of the left-hand side. So we issue the error message, and ask some help from the programmer to indicate the number of tokens this alternative might consume. The actual value does not matter so much, as long as you do not provide a finite length for something which is recursive and may lead to infinite insertions. It is only used to select the shortest alternative in case of an insertion.

This abstract interpretation has some costs and if you write all your parsers in monadic style it is unavoidable that this analysis is repeated over an over again. so: DO NOT WRITE MONADIC STYLE PARSERS WHEN USING THIS LIBRARY IF THERE IS AN APPLICATIVE ALTERNATIVE.

like image 112
Doaitse Swierstra Avatar answered Sep 27 '22 16:09

Doaitse Swierstra