I have a list that I need to parse where the all but the last element needs to be parsed by one parser, and the last element needs to be parsed by another parser.
a = "p1 p1b ... p2"
or
a = "p2"
Originally I tried
parser = do parse1 <- many parser1
parse2 <- parser2
return AParse parse1 parse2
The problem is that parse1 can consume a parse2 input. So parse1 always consumes the entire list, and leave parse2 with nothing.
Is there a way to say to apply parse1 to everything besides the last element in a string, and then apply parse2?
How about:
parseTrain car caboose = choice
[ fmap (:[]) $ try (caboose `endBy` eof),
, liftM2 (:) car (parseTrain car caboose)
[
The eof bugs me, since that makes this parser not compositional. I.e. you couldn't say:
char '(' >> parseTrain p1 p2 >> char ')'
Doing this compsitionally is very hard for a parser. How is it supposed to know to move on to char ')', without trying to at every opportunity and seeing if it fails? Doing so could exponential time.
If you need it to be compositional, does your problem have some additional structure you can exploit? Can you, for example, parse a list of all elements and then process the last one after the fact?
If you can factor parser1
so that is defined like so:
parser1 = (try parser2) <|> parser1extra
Then the problem becomes a list of parser1extra
or parser2
that must end in the later. You can code that as:
parserList =
liftM2 (:) (try parser1extra) parserList
<|>
liftM2 (:) (try parser2) (option [] parserList)
You may or may not need the try
calls depending on if those parsers have any prefix overlap.
If you don't want the return value to be a list, but instead your AParse datum, then you could re-write it this way:
parserList =
do
a <- try parser1extra
prefix a parserList
<|>
do
a <- try parser2
option (AParse [] a) (prefix a parserList)
where prefix a p = do
(AParse as t) <- p
return $ (AParse (a:as) t)
Or, a full example:
import Control.Monad
import Text.ParserCombinators.Parsec
parseNum = do { v <- many1 digit; spaces; return v }
parseWord = do { v <- many1 letter; spaces; return v }
parsePart = parseNum <|> parseWord
parsePartListEndingInWord =
liftM2 (:) (try parseNum) parsePartListEndingInWord
<|>
liftM2 (:) (try parseWord) (option [] parsePartListEndingInWord)
Actually, the calls to try aren't needed in this case, as parseNum
and parseWord
match no common prefix. Notice that parsePartListEndingInWord
doesn't actually reference parsePart
, but instead, the two options that make up parsePart
's definition
(Original answer, solving a somewhat different situation:)
How about something like:
parserTest = between (char '[') (char ']') $ do
p1s <- try parser1 `endBy` char ','
p2 <- parser2
return $ AParse p1s p2
Taking the punctuation out of your parsers and up into parseTest allows you to use the combinators between
and endBy
to do the work for you. Lastly, the try
is there so that if parser1
and parser2
match a common prefix, endBy
will perform the correct full backup to beginning of the common prefix.
Depending on your parsers, it is possible that you can leave the punctuation matching inside your sub-parsers, and all you need might be the a try
around parser1
:
parseTest = do parse1 <- many (try parser1)
parse2 <- parser2
return AParse parse1 parse2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With