scala: parser help

Tags: parsing, scala

I'm learning to write a simple parser using parser combinators. I'm writing the rules bottom-up and adding unit tests to verify them as I go. However, I'm stuck on using repsep() with whitespace as the separator.

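import scala.util.parsing.combinator.RegexParsers
object MyParser extends RegexParsers {
  // ~> keeps only the right-hand result, <~ only the left-hand one
  lazy val listVal: Parser[List[String]] =
    elem('{') ~> repsep("""\d+""".r, """\s+""".r) <~ elem('}')
}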

The rule was simplified to illustrate the problem. When I feed the parser with "{1 2 3}", it always complains that it doesn't match:

[1.4] failure: `}' expected but 2 found

What is the correct way to write a rule like the one I described?

Thanks

EnToutCas asked May 02 '26 05:05



1 Answer

By default, RegexParsers-derived parsers skip whitespace before attempting to match any string literal or regular expression. That is why your rule fails: by the time the \s+ separator is tried, the whitespace has already been skipped, so the separator never matches. Unless your notion of whitespace is unusual, you can just work with the default. If the characters or character sequences you wish to treat as ignorable whitespace are something other than the default (\s+), you can override the protected val whiteSpace: Regex = ... value in your RegexParsers parser. If you do not want any whitespace skipping to occur, override def skipWhitespace = false.
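To make the three options concrete, here is a minimal sketch. The object names (DefaultWs, SpacesOnly, NoSkip) and the [ \t]+ pattern are illustrative assumptions, not taken from the question, and the code assumes the scala-parser-combinators module is on the classpath.

```scala
import scala.util.matching.Regex
import scala.util.parsing.combinator.RegexParsers

// Default behaviour: whitespace matching \s+ is skipped before each regex or literal.
object DefaultWs extends RegexParsers {
  lazy val digits: Parser[List[String]] = rep("""\d+""".r)
}

// Hypothetical variant: only spaces and tabs are skipped, so a newline is significant.
object SpacesOnly extends RegexParsers {
  override protected val whiteSpace: Regex = """[ \t]+""".r
  lazy val digits: Parser[List[String]] = rep("""\d+""".r)
}

// Hypothetical variant: nothing is skipped; any whitespace must be matched explicitly.
object NoSkip extends RegexParsers {
  override def skipWhitespace = false
  lazy val digits: Parser[List[String]] = rep("""\d+""".r)
}

object WhitespaceDemo {
  def main(args: Array[String]): Unit = {
    // parseAll requires the whole input to be consumed.
    println(DefaultWs.parseAll(DefaultWs.digits, "1 2\n3"))
    println(SpacesOnly.parseAll(SpacesOnly.digits, "1 2\n3"))
    println(NoSkip.parseAll(NoSkip.digits, "1 2 3"))
  }
}
```

With the default settings the first input should parse as three numbers, while the other two variants should reject their inputs because the newline (or the space) is no longer skipped.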

Edit: So yes, changing this:

repsep("""\d+""".r,"""\s+""".r)

to this:

rep("""\d+""".r)

and leaving everything else defined in RegexParsers unchanged should do what you want.
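Put together, the rule might look like the sketch below. It is one possible version, not necessarily the asker's final code: it uses the string literals "{" and "}" in place of elem(...), which is an assumption of this sketch, so that whitespace before the braces is skipped as well. The object names BraceList and BraceListDemo are hypothetical, and the scala-parser-combinators module is assumed to be available.

```scala
import scala.util.parsing.combinator.RegexParsers

object BraceList extends RegexParsers {
  // ~> discards the result on its left, <~ discards the result on its right,
  // so only the list produced by rep(...) is kept.
  // Whitespace between the numbers is skipped implicitly by RegexParsers.
  lazy val listVal: Parser[List[String]] =
    "{" ~> rep("""\d+""".r) <~ "}"
}

object BraceListDemo {
  def main(args: Array[String]): Unit = {
    BraceList.parseAll(BraceList.listVal, "{1 2 3}") match {
      case BraceList.Success(values, _) => println(s"parsed: $values")
      case other                        => println(s"failed: $other")
    }
  }
}
```

Because rep accepts zero or more items, this version also accepts an empty list such as "{}"; rep1 could be used instead if at least one number is required.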

By the way, the common use of repsep is for things like comma-separated lists where you need to ensure the commas are there but don't need to keep them in the resulting parse tree (or AST).
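For comparison, a comma-separated variant of the same rule could be sketched as follows. The object name CommaList is hypothetical and the grammar is an illustration, not something from the question; the scala-parser-combinators module is assumed to be available.

```scala
import scala.util.parsing.combinator.RegexParsers

object CommaList extends RegexParsers {
  // repsep(p, sep) parses zero or more p separated by sep.
  // The commas must be present in the input, but only the p results are returned.
  lazy val listVal: Parser[List[String]] =
    "{" ~> repsep("""\d+""".r, ",") <~ "}"
}

object CommaListDemo {
  def main(args: Array[String]): Unit = {
    // Whitespace around the commas is still skipped by default.
    println(CommaList.parseAll(CommaList.listVal, "{1, 2,3}"))
    // A missing comma makes the parse fail.
    println(CommaList.parseAll(CommaList.listVal, "{1 2}"))
  }
}
```

Here the separator is a real token that whitespace skipping does not consume, which is the situation repsep is meant for.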

Randall Schulz answered May 03 '26 18:05



