Using Roslyn to parse/transform/generate code: am I aiming too high, or too low?

Question

(What I'm trying to do is work around the Application.Settings/MVVM problem by generating an interface and wrapper class from the VS-generated settings file.)

What I'd like to do is:

1. Parse a class declaration from a file.
2. Generate an interface declaration based on just the (non-static) properties of the class.
3. Generate a wrapper class which implements this interface, takes an instance of the original class in the constructor, and 'pipes' all the properties through to the instance.
4. Generate another class which implements the interface directly.

My question is two-fold:

1. Am I barking up the wrong tree? Would I be better off using CodeDom, T4, or Regex(!) for this, or for part of it? (I don't mind a bit of extra work, as this is mostly a learning experience.)
2. If Roslyn is the way to go, which bit of it should I be looking at? I was naively hoping there would be some way of walking the tree and spitting out just the bits that I want, but I'm having trouble getting my head round whether and how to use the SyntaxRewriter for that, or whether to use a fluent-style construction instead, querying the source multiple times for the bits I need.

If you want to comment on the MVVM aspect you can, but that's not the main thrust of the question :)

svick · Accepted Answer

If your requirement is parsing C# source code, then I think Roslyn is a good choice. And if you're going to use it for this part, I think it also makes sense to use it for code generation.

Code generation using Roslyn can be quite verbose (especially when compared with CodeDom), but I think that's not going to be a big issue for you.

I think SyntaxRewriter is best suited for making localized changes in code. But you're asking about parsing a whole class and generating new types based on it, and for that I think querying the syntax tree directly would work best.

For example, the simplest way of generating a read-only interface for all properties in a class could look something like this:

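// Requires: using System.Linq; using Microsoft.CodeAnalysis.CSharp;
//           using Microsoft.CodeAnalysis.CSharp.Syntax;
// compilationUnit is the root node of the parsed source file.

// Find the single class in the file and collect its property declarations.
var originalClass =
    compilationUnit.DescendantNodes().OfType<ClassDeclarationSyntax>().Single();
string originalClassName = originalClass.Identifier.ValueText;
var properties =
    originalClass.DescendantNodes().OfType<PropertyDeclarationSyntax>();

// Emit one getter-only property per original property. The explicit semicolon
// token is needed because an interface accessor has no body.
var generatedInterface =
    SyntaxFactory.InterfaceDeclaration("I" + originalClassName)
          .AddMembers(
              properties.Select(
                  p =>
                  SyntaxFactory.PropertyDeclaration(p.Type, p.Identifier)
                        .AddAccessorListAccessors(
                            SyntaxFactory.AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                                  .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken))))
                        .ToArray());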
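To show how the query-based approach above might fit into a complete method, here is a minimal sketch. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package; the names InterfaceSketch and GenerateInterface are made up for illustration. Unlike the snippet above, it looks only at properties declared directly in the class, skips static ones (as the question asks), and copies whichever unmodified accessors each property has.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class InterfaceSketch
{
    // Returns the source text of an interface exposing the non-static
    // properties declared directly in the first class found in `source`.
    public static string GenerateInterface(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var originalClass = root.DescendantNodes()
            .OfType<ClassDeclarationSyntax>().First();

        // Members (rather than DescendantNodes) ignores nested types.
        var properties = originalClass.Members
            .OfType<PropertyDeclarationSyntax>()
            .Where(p => !p.Modifiers.Any(SyntaxKind.StaticKeyword));

        var members = properties.Select(p =>
        {
            // Expression-bodied properties have no accessor list: getter only.
            // Accessors with their own modifiers (e.g. private set) are skipped.
            var kinds = p.AccessorList == null
                ? new[] { SyntaxKind.GetAccessorDeclaration }
                : p.AccessorList.Accessors
                    .Where(a => a.Modifiers.Count == 0)
                    .Select(a => a.Kind())
                    .ToArray();

            return SyntaxFactory.PropertyDeclaration(p.Type, p.Identifier)
                .AddAccessorListAccessors(kinds
                    .Select(k => SyntaxFactory.AccessorDeclaration(k)
                        .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken)))
                    .ToArray());
        });

        var generated = SyntaxFactory
            .InterfaceDeclaration("I" + originalClass.Identifier.ValueText)
            .AddModifiers(SyntaxFactory.Token(SyntaxKind.PublicKeyword))
            .AddMembers(members.ToArray<MemberDeclarationSyntax>());

        // Hand-built nodes carry no whitespace, so normalize before printing.
        return generated.NormalizeWhitespace().ToFullString();
    }
}
```

Calling GenerateInterface on the text of a settings class should give back the text of a public interface named after the class with an "I" prefix. The exact layout depends on NormalizeWhitespace.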

Jack Ukleja · Answer

On the question of code generation, my advice is to actually use a combination of inline code snippets (parsed using CSharpSyntaxTree.ParseText) and manually generated SyntaxNodes, but with a strong preference for the former. I have also used T4 in the past but am moving away from it due to a general lack of integration and capability.

Advantages/disadvantages of each:

Roslyn ParseText

Generates arguably more readable code-generator code.
Allows 'text templating' approach e.g. using C# 6 string interpolation.
Less verbose.
Guarantees valid syntax trees.
Can be more performant.
Easier to get started.
Text can become harder to read than SyntaxNodes if majority is procedural.
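As a rough illustration of the 'text templating' approach, the sketch below builds the wrapper class from the question with string interpolation and then parses it. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package and C# 7 tuples; WrapperTemplateSketch, BuildWrapper and the _inner field are hypothetical names. Note that ParseText returns a tree even when the text contains syntax errors, so the sketch checks the diagnostics itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class WrapperTemplateSketch
{
    // Builds "<className>Wrapper : I<className>", forwarding each property
    // to a wrapped instance held in the _inner field.
    public static ClassDeclarationSyntax BuildWrapper(
        string className,
        IEnumerable<(string Type, string Name)> properties)
    {
        // Literal braces are doubled inside interpolated strings.
        var members = string.Join("\n    ", properties.Select(p =>
            $"public {p.Type} {p.Name} {{ get {{ return _inner.{p.Name}; }} " +
            $"set {{ _inner.{p.Name} = value; }} }}"));

        var code = $@"
public class {className}Wrapper : I{className}
{{
    private readonly {className} _inner;

    public {className}Wrapper({className} inner) {{ _inner = inner; }}

    {members}
}}";

        var tree = CSharpSyntaxTree.ParseText(code);

        // Parsing never throws on bad input; errors show up as diagnostics.
        var firstError = tree.GetDiagnostics()
            .FirstOrDefault(d => d.Severity == DiagnosticSeverity.Error);
        if (firstError != null)
            throw new InvalidOperationException(firstError.ToString());

        return tree.GetRoot().DescendantNodes()
            .OfType<ClassDeclarationSyntax>().First();
    }
}
```

The returned ClassDeclarationSyntax can then be combined with hand-built nodes, for example by adding it to a namespace or compilation unit.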

Roslyn SyntaxNode building

Better for transforming existing syntax trees - no need to start from scratch.
- But existing trivia can make this confusing/complex.
More verbose. Arguably harder to read and build.
- Syntax trees are often more complex than you imagine
SyntaxFactory API provides guidance on valid syntax.
Roslyn Quoter helps you transform textual code to factory code.
Syntax trees are not necessarily valid.
Code is perhaps more robust once written.
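To make the "not necessarily valid" point concrete, here is a small sketch (assuming the Microsoft.CodeAnalysis.CSharp NuGet package; ValiditySketch and its methods are hypothetical names). SyntaxFactory will happily build an interface property whose accessor has neither a body nor a semicolon. One way to catch this is to print the node and re-parse the text, then look for error diagnostics.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class ValiditySketch
{
    // Builds "interface ISettings { int Retries { get; } }", optionally
    // leaving out the accessor's semicolon to produce an invalid tree.
    public static InterfaceDeclarationSyntax BuildInterface(bool withSemicolon)
    {
        var getter = SyntaxFactory.AccessorDeclaration(SyntaxKind.GetAccessorDeclaration);
        if (withSemicolon)
        {
            getter = getter.WithSemicolonToken(
                SyntaxFactory.Token(SyntaxKind.SemicolonToken));
        }

        var property = SyntaxFactory
            .PropertyDeclaration(SyntaxFactory.ParseTypeName("int"), "Retries")
            .AddAccessorListAccessors(getter);

        return SyntaxFactory.InterfaceDeclaration("ISettings").AddMembers(property);
    }

    // Returns true when the node's text parses again without syntax errors.
    public static bool ReparsesCleanly(SyntaxNode node)
    {
        var text = node.NormalizeWhitespace().ToFullString();
        return !CSharpSyntaxTree.ParseText(text)
            .GetDiagnostics()
            .Any(d => d.Severity == DiagnosticSeverity.Error);
    }
}
```

A check like ReparsesCleanly only covers syntax. Semantic problems such as unknown types would need a compilation.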

T4 templates

Good if the majority of the code to be generated is boilerplate.
No proper CI support.
No syntax highlighting or IntelliSense without 3rd party extensions.
One-to-one mapping between input and output files.
- Not ideal if you are doing more complex generation e.g. entire class hierarchy based on single input.
Still probably want to use Roslyn to "reflect" on input types, otherwise you will get into trouble with System.Reflection and file locks etc.
Less discoverable API. T4 includes, parameters etc. can be confusing to learn.

Roslyn code-gen tips

If you are only parsing snippets of code e.g. method statements, then you will need to use CSharpParseOptions.Default.WithKind(SourceCodeKind.Script) to get the right syntax nodes back.
If you are parsing a whole block of code for a method body then you will want to parse it as a GlobalStatementSyntax and then access the Statement property as a BlockSyntax.

Use a helper method to parse single SyntaxNodes:

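    // Parses `code` and returns the first top-level node of type TSyntax.
    // Pass asScript: true for fragments (e.g. statements) that are not valid
    // at the top level of a regular source file.
    private static TSyntax ParseText<TSyntax>(string code, bool asScript = false)
        where TSyntax : SyntaxNode
    {
        var options = asScript
            ? CSharpParseOptions.Default.WithKind(SourceCodeKind.Script)
            : CSharpParseOptions.Default;

        var syntaxNodes =
            CSharpSyntaxTree.ParseText(code, options)
                .GetRoot()
                .ChildNodes();

        // First() throws InvalidOperationException if no such node exists.
        return syntaxNodes.OfType<TSyntax>().First();
    }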

When building SyntaxNodes by hand you will typically want to make a final call to SyntaxNode.NormalizeWhitespace(elasticTrivia: true) to make the code "round-trippable".
Typically you will want to use SyntaxNode.ToFullString() to get the actual code text including trivia.
Use SyntaxTree.WithFilePath() as a convenient place to store the eventual file name for when you come to write out the code.
If your goal is to output source files, the end game is to end up with valid CompilationUnitSyntax nodes.
Don't forget to pretty-print using Formatter.Format as one of the final steps.
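A few of the tips above can be sketched together: parsing a method body in script mode, attaching it to a hand-built method, and normalizing whitespace before printing. This assumes the Microsoft.CodeAnalysis.CSharp NuGet package; BodyParsingSketch, ParseBlock and BuildMethod are hypothetical names, and Formatter.Format is left out because it needs the separate Workspaces package.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class BodyParsingSketch
{
    // Parses "{ ... }" as script code and returns it as a BlockSyntax.
    public static BlockSyntax ParseBlock(string code)
    {
        var options = CSharpParseOptions.Default.WithKind(SourceCodeKind.Script);

        // In script mode a top-level block arrives wrapped in a
        // GlobalStatementSyntax.
        var global = CSharpSyntaxTree.ParseText(code, options)
            .GetRoot()
            .ChildNodes()
            .OfType<GlobalStatementSyntax>()
            .First();

        return (BlockSyntax)global.Statement;
    }

    // Builds "<returnType> <name>() { ... }" around the parsed body and
    // returns its text, including trivia.
    public static string BuildMethod(string returnType, string name, string body)
    {
        var method = SyntaxFactory
            .MethodDeclaration(SyntaxFactory.ParseTypeName(returnType), name)
            .WithBody(ParseBlock(body));

        // The hand-built parts have no whitespace until normalized.
        return method.NormalizeWhitespace().ToFullString();
    }
}
```

For example, BuildMethod("int", "Compute", "{ var x = 1; return x + 1; }") should return the text of a parameterless method named Compute with that body.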

Kevin Pilch · Answer

I think Roslyn is a great way to solve this problem. As for which part of Roslyn to use, I would probably run a SyntaxWalker over the original class, and then use the fluent API to build up new SyntaxNodes for the new types you want to generate. You may be able to reuse some parts of the original tree in the generated code (for example, the parameter lists).

A quick example of what this might look like is:

using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class Program
{
    static void Main(string[] args)
    {
        var syntaxTree = CSharpSyntaxTree.ParseText(@"
class C
{
    internal void M(string s, int i)
    {
    }
}");

        // Walk the parsed class and print the interface that was built up.
        var walker = new Walker();
        walker.Visit(syntaxTree.GetRoot());
        Console.WriteLine(walker.Interface.NormalizeWhitespace().ToFullString());
    }
}

class Walker : CSharpSyntaxWalker
{
    private InterfaceDeclarationSyntax @interface = SyntaxFactory.InterfaceDeclaration("ISettings");

    private ClassDeclarationSyntax wrapperClass = SyntaxFactory.ClassDeclaration("SettingsWrapper")
        .AddBaseListTypes(SyntaxFactory.SimpleBaseType(SyntaxFactory.ParseTypeName("ISettings")));

    private ClassDeclarationSyntax @class = SyntaxFactory.ClassDeclaration("SettingsClass")
        .AddBaseListTypes(SyntaxFactory.SimpleBaseType(SyntaxFactory.ParseTypeName("ISettings")));

    public InterfaceDeclarationSyntax Interface => @interface;

    public override void VisitMethodDeclaration(MethodDeclarationSyntax node)
    {
        // Reuse the return type, name and parameter list from the original
        // method; an interface method ends in a semicolon instead of a body.
        var method = SyntaxFactory.MethodDeclaration(node.ReturnType, node.Identifier)
            .WithParameterList(node.ParameterList)
            .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));

        // TypeParameterList is null for non-generic methods.
        if (node.TypeParameterList != null)
        {
            method = method.WithTypeParameterList(node.TypeParameterList);
        }

        @interface = @interface.AddMembers(method);

        // More code to add members to the classes too.
    }
}

Tags:

c#

roslyn

code-generation

Benjol
