I am trying to write a Haskell Parsec parser that parses input data from a file into the LogLine datatype as follows:
    -- Final parser that combines the individual parsers.
    final :: Parser [LogLine]
    final = do { logLines <- sepBy1 logLine eol
               ; return logLines
               }
    -- The log line token declaration
    logLine :: Parser LogLine
    logLine = do
        name <- plainValue          -- parse the name (identifier)
        many1 space                 -- parse and throw away a space
        args1 <- bracketedValue     -- parse the first arguments
        many1 space                 -- throw away the second space
        args2 <- bracketedValue     -- parse the second list of arguments
        many1 space                 -- throw away the third space
        constant <- plainValue      -- parse the constant identifier
        space
        weighting <- plainValue     -- parse the weighting double
        space
        return $ LogLine name args1 args2 constant weighting
It parses everything just fine, but now I need to add comments to the file, and I have to modify the parser so that it ignores them. It should support only single-line comments, beginning with "--" and ending with a '\n'. I've tried defining the comment token as follows:
    comments :: Parser String
    comments = do
        string "--"
        comment <- (manyTill anyChar newline)
        return ""
And then plugging it into the final parser like so:
    final :: Parser [LogLine]
    final = do
        optional comments
        logLines <- sepBy1 logLine (comments<|>newline)
        optional comments
        return logLines
It compiles fine, but it does not parse. I've tried several minor modifications but the best result was parsing everything up to the first comment, so I'm beginning to think that this is not the way to do it. PS: I've seen this Similar Question, but it is slightly different from what I'm trying to achieve.
If I understand your description of the format in your comment correctly, your example for the format would be
    name arg1 arg2 c1 weight
    -- comment goes here
optionally followed by further log-lines and/or comments.
Then your problem is that there is a newline between the log-line and the comment line, which means that the comments part of the separator parser fails (comments must start with "--") without consuming input, so newline is tried and succeeds. Then the next line begins with "--", which makes plainValue fail without consuming input, and thus ends the sepBy1.
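The failure described above can be reproduced with a small sketch. Everything here is a simplified stand-in, not the code from the question: `item` (a word of letters) replaces logLine, and `comment`, `brokenSep`, `broken` and `brokenResult` are hypothetical names chosen for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for logLine: a single word of letters.
item :: Parser String
item = many1 letter

-- A comment as in the question: "--" up to and including the newline.
comment :: Parser ()
comment = string "--" >> manyTill anyChar newline >> return ()

-- A separator that accepts either a comment or a newline, but not a
-- newline followed by a comment.
brokenSep :: Parser ()
brokenSep = comment <|> (newline >> return ())

broken :: Parser [String]
broken = sepBy1 item brokenSep

-- The separator consumes the first '\n'; item then meets "--" and cannot match.
brokenResult :: Either ParseError [String]
brokenResult = parse broken "<input>" "a\n--c\nb"
```

With this input, `brokenResult` is a `Left`: the newline alternative of the separator matches, and the item parser then has nothing it can accept on the comment line.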
The solution is to let the separator first consume a newline, and then as many comment lines as follow:
    final = do
        skipMany comments
        sepEndBy1 logLine (newline >> skipMany comments)
By allowing the sequence to be ended by a separator (sepEndBy1 instead of sepBy1), any comment lines after the final LogLine are automatically skipped.
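For reference, here is a minimal self-contained sketch of the approach. The post does not show LogLine, plainValue or bracketedValue, so the definitions below are assumptions made only to get something that runs; `blanks` and `parseLog` are hypothetical helpers. Two deliberate differences from the question's code: fields are separated with `blanks` (spaces and tabs only) because Parsec's `space` also matches '\n', and `comments` wraps `string "--"` in `try` so that a line starting with a single '-' is not half-consumed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed shape of the record; all fields kept as raw strings.
data LogLine = LogLine String String String String String
  deriving (Show, Eq)

-- Assumed: a plain value is a run of non-blank, non-bracket characters.
plainValue :: Parser String
plainValue = many1 (noneOf " \t\n[]")

-- Assumed: a bracketed value is the text between '[' and ']'.
bracketedValue :: Parser String
bracketedValue = between (char '[') (char ']') (many (noneOf "]\n"))

-- One or more spaces or tabs, never a newline.
blanks :: Parser ()
blanks = skipMany1 (oneOf " \t")

logLine :: Parser LogLine
logLine = do
  name      <- plainValue
  blanks
  args1     <- bracketedValue
  blanks
  args2     <- bracketedValue
  blanks
  constant  <- plainValue
  blanks
  weighting <- plainValue
  return (LogLine name args1 args2 constant weighting)

-- A single-line comment: "--" up to and including the newline.
comments :: Parser ()
comments = try (string "--") >> manyTill anyChar newline >> return ()

-- Leading comments, then log lines separated (and optionally ended)
-- by a newline followed by any number of comment lines.
final :: Parser [LogLine]
final = do
  skipMany comments
  sepEndBy1 logLine (newline >> skipMany comments)

-- Hypothetical convenience wrapper around parse.
parseLog :: String -> Either ParseError [LogLine]
parseLog = parse final "<input>"
```

For example, `parseLog "n1 [a] [b] c1 0.5\n-- comment goes here\nn2 [c] [d] c2 1.5\n-- trailing comment\n"` yields two LogLine values, with both comment lines skipped.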