
Monoidal parsing -- what is it?


I just stumbled upon the term monoidal parsing in a slide deck titled "Introduction to Monoids" by Edward Kmett. The slides use Haskell throughout.

When searching for the term I found only a few mentions of it, most of them from the same author. So I think the term could be explained here.

So, is monoidal parsing something interesting and new? Does it appear anywhere other than the slides I linked to? And most importantly, what is it? The slides themselves don't seem to give a definition or highlight it much.

asked Aug 04 '12 by Tarrasch

2 Answers

I'll start with how parsers usually work. Broadly, a parser takes input tokens in sequential order. At some point (typically after all the tokens are exhausted), the parser returns output. One downside to this model is that it's inherently sequential: because the parser operates on a sequence of tokens in order, it's not obvious how to parallelize the process.
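For concreteness, here is a minimal sketch of that sequential model in Haskell; the type and names are my own illustration, not from the slides:

```haskell
-- A parser consumes tokens in order and returns a result plus the remaining input.
newtype Parser tok a = Parser { runParser :: [tok] -> Maybe (a, [tok]) }

-- Consume one token satisfying a predicate.
satisfy :: (tok -> Bool) -> Parser tok tok
satisfy p = Parser $ \ts -> case ts of
  (t : rest) | p t -> Just (t, rest)
  _                -> Nothing
```

The result of parsing one token is threaded into the next step, which is exactly what makes the process hard to parallelize.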

However, parsing can be parallelized if we have access to an operation capable of combining partial parse results into a single result. For example, if our language is representable with a context-free grammar, then we could parse every top-level definition separately and in parallel, then assemble the pieces later using the combining operation.

Monoidal parsing simply means that the parser has access to a suitable combining function. A monoid is a structure with an identity element and an associative binary operation. For example, lists form a monoid where the identity is the empty list and the operation is concatenation. Recall that associativity means (a++b)++c == a++(b++c). This is exactly the property needed to recombine parse results sensibly: the order in which the sub-parses are recombined doesn't matter, as long as each sub-parse stays in its proper position in the sequence.
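As a concrete (if toy) illustration, here is a well-known monoid for checking balanced parentheses; this sketch is mine, not lifted from the slides:

```haskell
-- 'B r l' summarises a chunk of input: it closes r parentheses it did not open
-- and opens l parentheses it does not close.
data B = B !Int !Int deriving (Eq, Show)

instance Semigroup B where
  B r1 l1 <> B r2 l2
    | l1 >= r2  = B r1 (l1 - r2 + l2)        -- the left chunk's opens absorb the right chunk's closes
    | otherwise = B (r1 + (r2 - l1)) l2

instance Monoid B where
  mempty = B 0 0

-- Each character parses to a summary on its own; chunks can then be parsed
-- independently and combined with (<>) in any grouping, thanks to associativity.
parseChar :: Char -> B
parseChar '(' = B 0 1
parseChar ')' = B 1 0
parseChar _   = mempty

balanced :: String -> Bool
balanced s = foldMap parseChar s == mempty
```

For example, `balanced "(()())"` is `True` and `balanced "())("` is `False`, and the answer is the same no matter how the input is split into chunks before combining.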

Of course, for actually writing a parallel parser, it's also necessary to split the input into chunks appropriately. If you want to parse top-level definitions in parallel, you need to be able to recognize where each definition begins, a task usually performed by the parser itself. As I recall, a large portion of his slides covers this topic. Splitting on top-level definitions is pretty coarse-grained; ideally our parser would be able to start from any arbitrary token, then make sense of the pieces later when the monoid operator is applied.
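Continuing the toy example above, a chunked parallel version might look like the following; I'm assuming the `parMap`/`rseq` combinators from the `parallel` package and `chunksOf` from the `split` package:

```haskell
import Control.Parallel.Strategies (parMap, rseq)  -- "parallel" package
import Data.List.Split (chunksOf)                  -- "split" package

-- Split the input at arbitrary positions, summarise each chunk independently
-- (potentially in parallel), then combine the partial results. Associativity
-- guarantees that the chunking does not affect the final answer.
balancedPar :: Int -> String -> Bool
balancedPar chunkSize s =
  mconcat (parMap rseq (foldMap parseChar) (chunksOf chunkSize s)) == mempty
```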

Unfortunately I can't say if "monoidal parsing" is new, as I'm not particularly familiar with the literature. However I suspect that any models or algorithms for parallel parsing involve a monoid structure, even if it isn't named explicitly. It's been well-known for some time that monoids are suitable for parallelizing problems, so I wouldn't be surprised if the technique is common among parser researchers.

answered Sep 19 '22 by John L


Try his other presentation on that page; it's the second one, after the one you are reading. It is something new, and all I can really do is paraphrase his slides: it is an attempt to take monadic parsing (as in Parsec) and use a weaker algebraic structure, so that there is more scope for restructuring the computation. The idea is to improve parallelism.
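Very roughly, and purely as my own illustration of "a weaker structure exposes more of the computation's shape" (not code from the talk): with Applicative the structure of a parser is fixed before any input is seen, so independent pieces can in principle be evaluated separately, whereas with Monad the continuation depends on a runtime value and must run sequentially.

```haskell
-- With Applicative, both arguments of <*> are known up front, so an
-- implementation is free to evaluate them independently.
bothA :: Applicative f => f a -> f b -> f (a, b)
bothA pa pb = (,) <$> pa <*> pb

-- With Monad, what runs next depends on a value produced at run time,
-- which forces a sequential, left-to-right evaluation order.
choose :: Monad m => m Bool -> m a -> m a -> m a
choose cond pa pb = do
  c <- cond
  if c then pa else pb
```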

Edit: the comments on the page suggest the two talks were scheduled back to back, so perhaps the mention in the slides you saw was a lead-in to the second talk.

answered Sep 23 '22 by Andrew