Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala combinator parser, what does >> mean?

I am little bit confusing about ">>" in scala. Daniel said in Scala parser combinators parsing xml? that it could be used to parameterize the parser base on result from previous parser. Could someone give me some example/hint ? I already read scaladoc but still not understand it.

thanks

like image 476
Tg. Avatar asked Nov 02 '11 10:11

Tg.


People also ask

What is Combinator in Scala?

In Scala, a combinator is a Higher-Order Function and also a pure function. As they are pure functions, we can easily compose them together very well as flexible, fine-grained building blocks for constructing larger, more complex programs. Some of the frequently used Scala combinators are as follows: map. flatMap.

What is parser in Scala?

Parsing in Scala using Combinators There is no Tree type in Scala, instead it should be Option[Any].) Scala interprets each rule in a grammar as a parser function and interprets the EBNF operators as higher-order functions (called combinators) that combine smaller parser functions into larger ones.

What is a parser in Haskell?

In a functional language such as Haskell, parsers can naturally be viewed as functions. type Parser = String Tree. A parser is a function that takes a string and returns some form of tree.


2 Answers

As I said, it serves to parameterize a parser, but let's walk through an example to make it clear.

Let's start with a simple parser, that parses a number follow by a word:

def numberAndWord = number ~ word
def number        = "\\d+".r
def word          = "\\w+".r

Under RegexParsers, this will parse stuff like "3 fruits".

Now, let's say you also want a list of what these "n things" are. For example, "3 fruits: banana, apple, orange". Let's try to parse that to see how it goes.

First, how do I parse "N" things? As it happen, there's a repN method:

def threeThings = repN(3, word)

That will parse "banana apple orange", but not "banana, apple, orange". I need a separator. There's repsep that provides that, but that won't let me specify how many repetitions I want. So, let's provide the separator ourselves:

def threeThings = word ~ repN(2, "," ~> word)

Ok, that words. We can write the whole example now, for three things, like this:

def listOfThings = "3" ~ word ~ ":" ~ threeThings
def word         = "\\w+".r
def threeThings  = word ~ repN(2, "," ~> word)

That kind of works, except that I'm fixing "N" in 3. I want to let the user specify how many. And that's where >>, also known as into (and, yes, it is flatMap for Parser), comes into. First, let's change threeThings:

def things(n: Int) = n match {
  case 1          => word ^^ (List(_))
  case x if x > 1 => word ~ repN(x - 1, "," ~> word) ^^ { case w ~ l => w :: l }
  case x          => err("Invalid repetitions: "+x)
}

This is slightly more complicated than you might have expected, because I'm forcing it to return Parser[List[String]]. But how do I pass a parameter to things? I mean, this won't work:

def listOfThings = number ~ word ~ ":" ~ things(/* what do I put here?*/)

But we can rewrite that like this:

def listOfThings = (number ~ word <~ ":") >> {
  case n ~ what => things(n.toInt)
}

That is almost good enough, except that I now lost n and what: it only returns "List(banana, apple, orange)", not how many there ought to be, and what they are. I can do that like this:

def listOfThings   = (number ~ word <~ ":") >> {
  case n ~ what => things(n.toInt) ^^ { list => new ~(n.toInt, new ~(what, list)) }
}
def number         = "\\d+".r
def word           = "\\w+".r
def things(n: Int) = n match {
  case 1          => word ^^ (List(_))
  case x if x > 1 => word ~ repN(x - 1, "," ~> word) ^^ { case w ~ l => w :: l }
  case x          => err("Invalid repetitions: "+x)
}

Just a final comment. You might have wondered asked yourself "what do you mean flatMap? Isn't that a monad/for-comprehension thingy?" Why, yes, and yes! :-) Here's another way of writing listOfThings:

def listOfThings   = for {
  nOfWhat  <- number ~ word <~ ":"
  n ~ what = nOfWhat
  list     <- things(n.toInt)
}  yield new ~(n.toInt, new ~(what, list))

I'm not doing n ~ what <- number ~ word <~ ":" because that uses filter or withFilter in Scala, which is not implemented by Parsers. But here's even another way of writing it, that doesn't have the exact same semantics, but produce the same results:

def listOfThings   = for {
  n    <- number
  what <- word
  _    <- ":" : Parser[String]
  list <- things(n.toInt)
}  yield new ~(n.toInt, new ~(what, list))

This might even give one to think that maybe the claim that "monads are everywhere" might have something to it. :-)

like image 114
Daniel C. Sobral Avatar answered Sep 27 '22 17:09

Daniel C. Sobral


The method >> takes a function that is given the result of the parser and uses it to contruct a new parser. As stated, this can be used to parameterize a parser on the result of a previous parser.

Example

The following parser parses a line with n + 1 integer values. The first value n states the number of values to follow. This first integer is parsed and then the result of this parse is used to construct a parser that parses n further integers.

Parser definition

The following line assumes, that you can parse an integer with parseInt: Parser[Int]. It first parses an integer value n and then uses >> to parse n additional integers which form the result of the parser. So the initial n is not returned by the parser (though it's the size of the returned list).

def intLine: Parser[Seq[Int]] = parseInt >> (n => repN(n,parseInt))

Valid inputs

1 42
3 1 2 3
0

Invalid inputs

0 1
1
3 42 42
like image 32
ziggystar Avatar answered Sep 27 '22 17:09

ziggystar