I'm trying to get a parser created from concatenated parsers to backtrack in parsec.
here's the code:
ab = (try $ char 'a') <|> (try $ char 'b')
cd = (try $ char 'b') <|> (try $ char 'c')
abcd = (try $ many1 ab) >> (try cd)
here, when I run
parse abcd "parser"
with the input "aaaaaaaaaaaaaac", it works, but it errors on "aaaaaaaaaaaaab", which should be acceptable.
Any insights as to what I can do to get this to work
Thanks in advance
So you're trying to write the regular expression (a|b)+(b|c)
in Haskell? This is how you would do it:
import Text.Parsec.String
import Text.Parsec
ab :: Parser Char
ab = char 'a' <|> char 'b'
bc :: Parser Char
bc = char 'b' <|> char 'c'
abc :: Parser String
abc = do
a <- ab
b <- bc
return [a,b]
abbc :: Parser String
abbc = try abc <|> do
c <- ab
s <- abbc
return (c:s)
parser :: String -> Either ParseError String
parser = parse abbc "(a|b)+(b|c)"
Now load up ghci
and run the code as follows:
aaditmshah@home:~$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :load Test.hs
[1 of 1] Compiling Main ( Test.hs, interpreted )
Ok, modules loaded: Main.
*Main> parser "aaaaaaaaaaaaaac"
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package bytestring-0.10.0.2 ... linking ... done.
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package mtl-2.1.2 ... linking ... done.
Loading package text-0.11.3.1 ... linking ... done.
Loading package parsec-3.1.5 ... linking ... done.
Right "aaaaaaaaaaaaaac"
*Main> parser "aaaaaaaaaaaaaab"
Right "aaaaaaaaaaaaaab"
*Main>
Leaving GHCi.
How does this work? Let's understand it parser by parser:
ab :: Parser Char
ab = char 'a' <|> char 'b'
bc :: Parser Char
bc = char 'b' <|> char 'c'
These two parsers are simple. They correspond to the regular expressions (a|b)
and (b|c)
respectively.
abc :: Parser String
abc = do
a <- ab
b <- bc
return [a,b]
This is also relatively simple. It corresponds to the regular expression (a|b)(b|c)
.
abbc :: Parser String
abbc = try abc <|> do
c <- ab
s <- abbc
return (c:s)
This is where things get complicated. This parser does the following:
(a|b)(b|c)
.(a|b)
and then goes back to step 1 to match the rest of the input.Hence it corresponds to the regular expression (a|b)*(a|b)(b|c)
which is the same as (a|b)+(b|c)
.
The try
combinator is important. Without it the parser will try to match the input with (a|b)(b|c)
. Failing to do so it will throw an error instead of backtracking and trying the alternate condition.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With