Using Roslyn to parse/transform/generate code: am I aiming too high, or too low?

(What I'm trying to do is work around the Application.Settings/MVVM problem by generating an interface and wrapper class from the VS-generated settings file.)

What I'd like to do is:

  • Parse a class declaration from file
  • Generate an interface declaration based on just the (non static) properties of the class
  • Generate a wrapper class which implements this interface, takes an instance of the original class in the constructor, and 'pipes' all the properties through to the instance.
  • Generate another class which implements the interface directly.

My question is two-fold:

  • Am I barking up the wrong tree? Would I be better off using CodeDom, T4, or Regex(!) for this, or for part of it? (I don't mind a bit of extra work, as this is mostly a learning experience.)
  • If Roslyn is the way to go, which bit of it should I be looking at? I was kind of naively hoping that there would be some way of walking the tree and spitting out just the bits that I want, but I'm having trouble getting my head round whether/how to use the SyntaxRewriter to do it, or whether to use a fluent-style construction, querying the source multiple times for the bits I need.

If you want to comment on the MVVM aspect you can, but that's not the main thrust of the question :)

Benjol asked May 02 '13 12:05


3 Answers

If your requirement is parsing C# source code, then I think Roslyn is a good choice. And if you're going to use it for that part, I think it also makes sense to use it for code generation.

Code generation using Roslyn can be quite verbose (especially when compared with CodeDom), but I think that's not going to be a big issue for you.

I think SyntaxRewriter is best suited for making localized changes in code. But you're asking about parsing a whole class and generating new types based on it, and for that I think querying the syntax tree directly would work best.

For example, the simplest way to generate a read-only interface for all properties in a class could look something like this:

// Assumes the file contains exactly one class; Single() throws otherwise.
var originalClass =
    compilationUnit.DescendantNodes().OfType<ClassDeclarationSyntax>().Single();
string originalClassName = originalClass.Identifier.ValueText;
var properties =
    originalClass.DescendantNodes().OfType<PropertyDeclarationSyntax>();

// Build "interface I<ClassName>" with one get-only property per original property.
var generatedInterface =
    SyntaxFactory.InterfaceDeclaration("I" + originalClassName)
          .AddMembers(
              properties.Select(
                  p =>
                  SyntaxFactory.PropertyDeclaration(p.Type, p.Identifier)
                        .AddAccessorListAccessors(
                            SyntaxFactory.AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                                  .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken))))
                        .ToArray());
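
The snippet above starts from an existing compilationUnit and stops at a syntax node. A minimal end-to-end sketch, assuming the source arrives as a string and the result is wanted as text, might look like the following. The names InterfaceSketch and Generate are hypothetical, and the filter on static properties is an addition to match the question's "non static" requirement, not part of the answer's code.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper type, for illustration only.
static class InterfaceSketch
{
    // Returns the text of a get-only interface for the first class found in `source`.
    public static string Generate(string source)
    {
        CompilationUnitSyntax compilationUnit = SyntaxFactory.ParseCompilationUnit(source);

        var originalClass =
            compilationUnit.DescendantNodes().OfType<ClassDeclarationSyntax>().First();

        // Direct members only (nested types are skipped), and no static properties.
        var members = originalClass.Members
            .OfType<PropertyDeclarationSyntax>()
            .Where(p => !p.Modifiers.Any(m => m.IsKind(SyntaxKind.StaticKeyword)))
            .Select(p => (MemberDeclarationSyntax)SyntaxFactory
                .PropertyDeclaration(p.Type, p.Identifier)
                .AddAccessorListAccessors(
                    SyntaxFactory.AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                        .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken))))
            .ToArray();

        // NormalizeWhitespace lays out the hand-built nodes before they are turned into text.
        return SyntaxFactory.InterfaceDeclaration("I" + originalClass.Identifier.ValueText)
            .AddMembers(members)
            .NormalizeWhitespace()
            .ToFullString();
    }
}
```

Calling InterfaceSketch.Generate with the text of the settings file would return the interface source as a string, ready to be written to disk.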

svick answered Oct 16 '22 12:10


On the question of code generation, my advice is to use a combination of inline code snippets (parsed using CSharpSyntaxTree.ParseText) and manually built SyntaxNodes, with a strong preference for the former. I have also used T4 in the past but am moving away from it due to its general lack of integration and capability.
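
As a rough illustration of the snippet-based approach applied to the question's wrapper class, the sketch below queries the parsed class for its instance properties, writes the wrapper as interpolated text, and parses that text back into a tree. The names WrapperSketch, BuildWrapperSource and inner, and the I{name} / {name}Wrapper naming scheme, are assumptions made for the example. It only forwards getters.

```csharp
using System;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper type, for illustration only.
static class WrapperSketch
{
    // Returns source text for a wrapper around the first class found in `source`.
    public static string BuildWrapperSource(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var originalClass = root.DescendantNodes().OfType<ClassDeclarationSyntax>().First();
        string name = originalClass.Identifier.ValueText;

        // Instance properties declared directly on the class.
        var properties = originalClass.Members
            .OfType<PropertyDeclarationSyntax>()
            .Where(p => !p.Modifiers.Any(m => m.IsKind(SyntaxKind.StaticKeyword)));

        // Text templating: doubled braces are literal braces in an interpolated string.
        var text = new StringBuilder();
        text.AppendLine($"class {name}Wrapper : I{name}");
        text.AppendLine("{");
        text.AppendLine($"private readonly {name} inner;");
        text.AppendLine($"public {name}Wrapper({name} inner) {{ this.inner = inner; }}");
        foreach (var p in properties)
        {
            string id = p.Identifier.ValueText;
            text.AppendLine($"public {p.Type} {id} {{ get {{ return inner.{id}; }} }}");
        }
        text.AppendLine("}");

        // Parsing the text back gives a real tree that can be normalized or transformed further.
        return SyntaxFactory.ParseCompilationUnit(text.ToString())
            .NormalizeWhitespace()
            .ToFullString();
    }
}
```

The template is deliberately left unindented because NormalizeWhitespace takes care of layout after parsing.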

Advantages/disadvantages of each:

Roslyn ParseText

  • Generates arguably more readable code-generator code.
  • Allows 'text templating' approach e.g. using C# 6 string interpolation.
  • Less verbose.
  • Guarantees well-formed syntax trees, since they come straight from the parser (invalid input still shows up as diagnostics).
  • Can be more performant.
  • Easier to get started.
  • Text can become harder to read than SyntaxNodes if majority is procedural.

Roslyn SyntaxNode building

  • Better for transforming existing syntax trees - no need to start from scratch.
    • But existing trivia can make this confusing/complex.
  • More verbose. Arguably harder to read and build.
    • Syntax trees are often more complex than you imagine
  • SyntaxFactory API provides guidance on valid syntax.
  • Roslyn Quoter helps you transform textual code to factory code.
  • Syntax trees are not necessarily valid.
  • Code is perhaps more robust once written.

T4 templates

  • Good if the majority of the code to be generated is boilerplate.
  • No proper CI support.
  • No syntax highlighting or IntelliSense without third-party extensions.
  • One to one mapping between input and output files.
    • Not ideal if you are doing more complex generation e.g. entire class hierarchy based on single input.
  • Still probably want to use Roslyn to "reflect" on input types, otherwise you will get into trouble with System.Reflection and file locks etc.
  • Less discoverable API. T4 includes, parameters etc. can be confusing to learn.

Roslyn code-gen tips

  • If you are only parsing snippets of code e.g. method statements, then you will need to use CSharpParseOptions.Default.WithKind(SourceCodeKind.Script) to get the right syntax nodes back.
  • If you are parsing a whole block of code for a method body then you will want to parse it as a GlobalStatementSyntax and then access the Statement property as a BlockSyntax.
  • Use a helper method to parse single SyntaxNodes:

        private static TSyntax ParseText<TSyntax>(string code, bool asScript = false)
        {
            var options = asScript
                ? CSharpParseOptions.Default.WithKind(SourceCodeKind.Script)
                : CSharpParseOptions.Default;
    
            var syntaxNodes =
                CSharpSyntaxTree.ParseText(code, options)
                    .GetRoot()
                    .ChildNodes();
    
            return syntaxNodes.OfType<TSyntax>().First();
        }
    
  • When building SyntaxNodes by hand you will typically want to make a final call to NormalizeWhitespace(elasticTrivia: true) on the root SyntaxNode to make the code "round-trippable".
  • Typically you will want to use SyntaxNode.ToFullString() to get the actual code text including trivia.
  • Use SyntaxTree.WithFilePath() as a convenient place to store the eventual file name for when you come to write out the code.
  • If your goal is to output source files, the aim is to end up with valid CompilationUnitSyntax nodes.
  • Don't forget to pretty-print using Formatter.Format as one of the final steps.
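
Several of the tips above can be combined in one short sketch: parse a method body as script code, pull the BlockSyntax out of the GlobalStatementSyntax, attach it to a hand-built method, then normalize and print. The names MethodBodySketch, ParseBody, BuildMethod and Run are hypothetical, and the body text is an arbitrary example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper type, for illustration only.
static class MethodBodySketch
{
    // Parses "{ ... }" in script mode and returns it as a block.
    public static BlockSyntax ParseBody(string code)
    {
        var options = CSharpParseOptions.Default.WithKind(SourceCodeKind.Script);
        var global = CSharpSyntaxTree.ParseText(code, options)
            .GetRoot()
            .ChildNodes()
            .OfType<GlobalStatementSyntax>()
            .First();

        // The cast throws if the snippet was not a single braced block.
        return (BlockSyntax)global.Statement;
    }

    // Builds "public void Run() { ... }" around the parsed body and returns its text.
    public static string BuildMethod()
    {
        BlockSyntax body = ParseBody("{ var x = 1; System.Console.WriteLine(x); }");

        return SyntaxFactory
            .MethodDeclaration(
                SyntaxFactory.PredefinedType(SyntaxFactory.Token(SyntaxKind.VoidKeyword)),
                "Run")
            .AddModifiers(SyntaxFactory.Token(SyntaxKind.PublicKeyword))
            .WithBody(body)
            .NormalizeWhitespace()
            .ToFullString();
    }
}
```

This mixes the two styles deliberately: the statements come from text, while the method signature around them is built with SyntaxFactory.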

Jack Ukleja answered Oct 16 '22 12:10


I think Roslyn is a great way to solve this problem. As for which part of Roslyn to use, I would probably run a SyntaxWalker over the original class and then use the fluent API to build up new SyntaxNodes for the types you want to generate. You may be able to reuse some parts of the original tree in the generated code (for example, the parameter lists).

A quick example of what this might look like is:

using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class Program
{
    static void Main(string[] args)
    {
        var syntaxTree = CSharpSyntaxTree.ParseText(@"
class C
{
    internal void M(string s, int i)
    {
    }
}");

        // Walk the parsed tree so the walker can collect members as it visits them.
        var walker = new Walker();
        walker.Visit(syntaxTree.GetRoot());
    }
}

class Walker : CSharpSyntaxWalker
{
    private InterfaceDeclarationSyntax @interface = SyntaxFactory.InterfaceDeclaration("ISettings");

    private ClassDeclarationSyntax wrapperClass = SyntaxFactory.ClassDeclaration("SettingsWrapper")
        .AddBaseListTypes(SyntaxFactory.SimpleBaseType(SyntaxFactory.ParseTypeName("ISettings")));

    private ClassDeclarationSyntax @class = SyntaxFactory.ClassDeclaration("SettingsClass")
        .AddBaseListTypes(SyntaxFactory.SimpleBaseType(SyntaxFactory.ParseTypeName("ISettings")));

    public override void VisitMethodDeclaration(MethodDeclarationSyntax node)
    {
        var parameters = node.ParameterList.Parameters.ToArray();

        // An interface member has no body, so it ends with a semicolon.
        var method = SyntaxFactory.MethodDeclaration(node.ReturnType, node.Identifier.ValueText)
            .AddParameterListParameters(parameters)
            .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));

        // TypeParameterList is null for non-generic methods such as M above.
        if (node.TypeParameterList != null)
        {
            method = method.WithTypeParameterList(node.TypeParameterList);
        }

        @interface = @interface.AddMembers(method);

        // More code to add members to the classes too.
    }
}

Kevin Pilch answered Oct 16 '22 12:10