
Extending the Mono C# compiler: is there any documentation or precedent?

I am currently involved in some interesting programming language research which has, up until now, centred around extending the upcoming Java 7.0 compiler with some very powerful programmer-productivity-based features. The work should be equally applicable to related programming languages such as C#.

I'm currently scoping out the options for prototyping a C# port of the functionality. I would prefer open-source options so that the fruits of this work can be shared with the broadest-possible audience. Thus the Mono C# compiler seems to be the most obvious starting point. I'm an experienced C# developer so writing the code isn't the problem. I'm mainly concerned about extending the compiler in a maintainable and supported fashion. In the Mono FAQ on the subject (link) it is stated that "Mono has already been used as a foundation for trying out new ideas for the C# language (there are three or four compilers derived from Mono's C# compiler)". Unfortunately, there are no further pointers than this and, so far, Google searches have not turned anything up.

I'm wondering if anybody out there has any information on this. Do mcs/gmcs/dmcs have a standard extensibility model? Specifically, I will be performing some interesting transformations on a program's abstract syntax tree. Is there a standard mechanism for inserting functionality into the compiler chain between abstract syntax tree generation and the type checker and then code generation?

Up until now I've written some ad-hoc extensions to the code (primarily in the code generator), but this doesn't seem to be a maintainable solution, especially given that I intend to keep my extensions up to date with the Git trunk of Mono as much as possible. Furthermore, it would be nice to be able to update my extensions without having to recompile the whole compiler every time I make a change. I would like to be able to wrap all my AST manipulations into a single .NET assembly that could be dynamically loaded by mcs/gmcs/dmcs without having to hack at the core compiler code directly.
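
To make that concrete, here is a minimal sketch of the kind of extension point I have in mind. Everything in it is an assumption for illustration: SyntaxNode, IAstPass, RenamePass and PassLoader are hypothetical names of my own, not anything that exists in mcs/gmcs/dmcs, and the toy tree stands in for the compiler's real AST. Only the reflection calls are standard .NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for the compiler's AST; mcs exposes no such type.
public sealed class SyntaxNode
{
    public string Kind;
    public List<SyntaxNode> Children = new List<SyntaxNode>();
    public SyntaxNode(string kind) { Kind = kind; }
}

// Hypothetical contract that a plugin assembly would implement.
public interface IAstPass
{
    SyntaxNode Transform(SyntaxNode root);
}

// Example plugin pass: rewrites every "Foo" node into a "Bar" node.
public sealed class RenamePass : IAstPass
{
    public SyntaxNode Transform(SyntaxNode root)
    {
        if (root.Kind == "Foo") root.Kind = "Bar";
        foreach (var child in root.Children) Transform(child);
        return root;
    }
}

public static class PassLoader
{
    // Instantiates every concrete IAstPass in the assembly that has a
    // public parameterless constructor.
    public static List<IAstPass> LoadPasses(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => typeof(IAstPass).IsAssignableFrom(t)
                        && !t.IsAbstract && !t.IsInterface)
            .Where(t => t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IAstPass)Activator.CreateInstance(t))
            .ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var root = new SyntaxNode("Foo");
        root.Children.Add(new SyntaxNode("Foo"));

        // The demo scans its own assembly so that it is self-contained;
        // a real host would call Assembly.LoadFrom(pathToPlugin) instead.
        foreach (var pass in PassLoader.LoadPasses(Assembly.GetExecutingAssembly()))
            root = pass.Transform(root);

        Console.WriteLine(root.Kind + " " + root.Children[0].Kind);
    }
}
```

The idea is that the compiler would run the loaded passes at one fixed point between parsing and type checking, so my extensions would only depend on that one interface rather than on compiler internals.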

Any thoughts or pointers on extending the Mono C# compiler would be gratefully received!

UPDATES (23 October 2010)

In response to the answers to my question, I decided to start working on a branch of Mono in order to create a simple extensibility model for the compiler. It's in its very early stages, but here it is on GitHub:

http://github.com/rcook/mono-extensibility

And the main commit is: http://github.com/rcook/mono-extensibility/commit/a0456c852e48f6822e6bdad7b4d12a357ade0d01

If anybody would be interested in collaborating on this project, please let me know!

Richard Cook asked Oct 03 '10 00:10


2 Answers

Unfortunately, I cannot adequately answer your question, but if you look at the examples of C# extensions on Miguel de Icaza's blog, you will notice that all of them take the form of patches to the compiler, not plugins or extensions. This seems to indicate that there is no such API.

Note that all of these examples are of much smaller scope than what you seem to be working on:

  • Parameterless Anonymous Methods (this post actually explicitly mentions concerns about the maintainability of such language extensions)
  • String Interpolation
  • Destructuring Assignment for Tuples
  • Syntactic Sugar for IEnumerable

These are mostly localized syntactic sugar, with no "interesting" behavior. The fourth patch, for example, implements Cω's syntactic sugar for IEnumerables, but without any of the Cω semantics that make this syntax interesting. If you look at the patch, you can see that it literally does a stupid syntactic expansion of ~T into IEnumerable<T>, as opposed to Cω, where member access and method invocation are properly lifted over streams.

Microsoft Research's Phoenix Compiler Pipeline was once explicitly touted as the solution to such extensibility problems, but it seems that it now focuses mostly on optimization and analysis at the IR level in a code generation backend. In fact, I'm not even sure whether the project is still alive.

Jörg W Mittag answered Nov 05 '22 17:11


The Mono C# compiler is a bit of a hack. I spent around a week figuring out how to use information from the parse tree. The compiler does not produce any intermediate representation, and code generation may break parts of the parse tree. Still, the parser and tokenizer might prove useful to you, and you can take it from there. SharpDevelop also provides a C# parser, which is easier to use than the Mono C# parser. If F# also works for you, I would recommend it. The source is much cleaner than Mono's and is available under an open-source license.

Stefan Savev answered Nov 05 '22 18:11