
Roslyn - Find all symbols

I'm using the following code to retrieve all symbols that are used in a block of code. This includes declarations and references to symbols. Unfortunately the GetSymbolInfo call is quite slow, and because of that the total time that this method takes can be long. Is there a way to speed this up?

    public static IEnumerable<ISymbol> GetAllSymbols(CSharpCompilation compilation, SyntaxNode root)
    {
        var noDuplicates = new HashSet<ISymbol>();

        var model = compilation.GetSemanticModel(root.SyntaxTree);

        foreach (var node in root.DescendantNodesAndSelf())
        {
            ISymbol symbol = model.GetDeclaredSymbol(node) ??
                model.GetSymbolInfo(node).Symbol;

            if (symbol != null)
            {
                if (noDuplicates.Add(symbol))
                    yield return symbol; 
            }
        }
    }
Asked by TWT, Oct 16 '25 07:10


1 Answer

I see several issues with your code.

The first one is a bit of a detail, but you say you are looking for symbols you "use". Do you consider declaring a symbol a kind of using a symbol? If not, you can get rid of model.GetDeclaredSymbol(node).

The second issue is much more important: you are getting the same symbols way too often.

Take for example the following statement:

SomeMethod();

That is an ExpressionStatement node. Within it there is an InvocationExpression, and within that there is an IdentifierName. You are calling model.GetSymbolInfo(node) on all three of these nodes. You should look for a way to avoid that.
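To see that nesting for yourself, here is a minimal sketch that parses a single statement and lists the kind of every node in it. The class and method names (NodeKindLister, ListKinds) are made up for illustration; the Roslyn calls used are SyntaxFactory.ParseStatement, DescendantNodesAndSelf and Kind().

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class NodeKindLister
{
    // Hypothetical helper: parses one statement and returns the kind of
    // every syntax node in it, starting with the statement node itself.
    public static List<SyntaxKind> ListKinds(string statementText)
    {
        var statement = SyntaxFactory.ParseStatement(statementText);

        return statement
            .DescendantNodesAndSelf()
            .Select(node => node.Kind())
            .ToList();
    }
}
```

Calling NodeKindLister.ListKinds("SomeMethod();") should show the ExpressionStatement, InvocationExpression and IdentifierName nodes mentioned above, along with an ArgumentList node for the empty parentheses. The original loop visits every one of them.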

You would get a lot of symbols if you only called model.GetSymbolInfo(node) on nodes of type SimpleNameSyntax (that is, its derived types IdentifierNameSyntax and GenericNameSyntax).

Something like:

public static IEnumerable<ISymbol> GetAllSymbols(CSharpCompilation compilation, SyntaxNode root)
{
    var noDuplicates = new HashSet<ISymbol>();

    var model = compilation.GetSemanticModel(root.SyntaxTree);

    foreach (var node in root.DescendantNodesAndSelf())
    {
        switch (node.Kind())
        {
            case SyntaxKind.IdentifierName:
            case SyntaxKind.GenericName:
                ISymbol symbol = model.GetSymbolInfo(node).Symbol;

                if (symbol != null && noDuplicates.Add(symbol))
                {
                    yield return symbol;
                }
                break;
        }
    }
}

It would not get all symbols though. For example, symbols for operators would not be found.
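If operators matter to you, one possible way to close that particular gap is to also query the operator expression nodes. This is only a sketch under a few assumptions: the class and method names are hypothetical, it takes a SemanticModel directly instead of a compilation, it uses SymbolEqualityComparer.Default for the set (available in recent Roslyn versions), and it only considers binary and unary operator expressions.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class OperatorAwareCollector
{
    // Hypothetical variant of GetAllSymbols: looks up names plus
    // binary/unary operator expressions, and nothing else.
    public static IEnumerable<ISymbol> GetNameAndOperatorSymbols(SemanticModel model, SyntaxNode root)
    {
        var seen = new HashSet<ISymbol>(SymbolEqualityComparer.Default);

        foreach (var node in root.DescendantNodesAndSelf())
        {
            bool isName = node is SimpleNameSyntax;
            bool isOperator = node is BinaryExpressionSyntax
                || node is PrefixUnaryExpressionSyntax
                || node is PostfixUnaryExpressionSyntax;

            if (!isName && !isOperator)
                continue;

            // For an operator expression, the symbol (if any) is the operator method.
            var symbol = model.GetSymbolInfo(node).Symbol;

            if (symbol != null && seen.Add(symbol))
                yield return symbol;
        }
    }
}
```

Other constructs that bind to symbols without a name node would still be missed, so treat this as an illustration of the trade-off rather than a complete list.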

And that brings me to my third point: you really should consider which symbols you are after. Do you really need all symbols?

Even if the answer to that is "yes", you could avoid many redundant lookups by turning the above logic around: query every node except the kinds you know are redundant.

For example:

public static IEnumerable<ISymbol> GetAllSymbols(CSharpCompilation compilation, SyntaxNode root)
{
    var noDuplicates = new HashSet<ISymbol>();

    var model = compilation.GetSemanticModel(root.SyntaxTree);

    foreach (var node in root.DescendantNodesAndSelf())
    {
        switch (node.Kind())
        {
            case SyntaxKind.ExpressionStatement:
            case SyntaxKind.InvocationExpression:
                break;
            default:
                ISymbol symbol = model.GetSymbolInfo(node).Symbol;

                if (symbol != null)
                {
                    if (noDuplicates.Add(symbol))
                        yield return symbol;
                }
                break;
        }
    }
}

In this case I filtered out only the ExpressionStatement and InvocationExpression from the example above. There are many more you could safely filter out, but I leave that as an exercise for you.
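As a starting point for that exercise, here is a rough sketch that moves the exclusions into a separate predicate so the list is easy to extend. The names (FilteredSymbolCollector, IsSkipped) are hypothetical, the method takes a SemanticModel directly, and the set of skipped node types is my assumption of reasonable candidates: statements, argument lists and arguments are wrappers around other nodes, and invocations are skipped following the example above. Check each one against the symbols you actually need.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class FilteredSymbolCollector
{
    public static IEnumerable<ISymbol> GetSymbols(SemanticModel model, SyntaxNode root)
    {
        var seen = new HashSet<ISymbol>(SymbolEqualityComparer.Default);

        foreach (var node in root.DescendantNodesAndSelf())
        {
            // Only the node itself is skipped; its descendants are still visited.
            if (IsSkipped(node))
                continue;

            var symbol = model.GetSymbolInfo(node).Symbol;

            if (symbol != null && seen.Add(symbol))
                yield return symbol;
        }
    }

    // Assumed exclusion list; extend or shrink it to fit your needs.
    private static bool IsSkipped(SyntaxNode node) =>
        node is StatementSyntax
        || node is InvocationExpressionSyntax
        || node is ArgumentListSyntax
        || node is ArgumentSyntax;
}
```

Keeping the exclusions in one place makes it easier to measure the effect of each added node type on both speed and the set of symbols returned.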

Answered by Kris Vandermotten, Oct 17 '25 20:10