I've been experimenting with the recently open-sourced C# compiler in Roslyn, seeing if I can add language features.
I'm now trying to add some syntactic sugar: a new prefix operator that is basically shorthand for a certain pattern. For now I'm piggy-backing on the pre-existing & ("address of") operator, outside of unsafe contexts.
The pattern I want to expand is as follows: &n is equivalent to:
Property.Bind(v => n = v, () => n)
The method Property.Bind is assumed to be available in a library, with the signature:
public static IProperty<T> Bind<T>(Action<T> set, Func<T> get)
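To make the intent concrete, here is a minimal sketch of what such a library could look like. Only the Bind signature comes from the description above; the Value member on IProperty<T>, the DelegateProperty<T> class, and the Demo class are assumptions made up for illustration.

```csharp
using System;

// Hypothetical interface: the real IProperty<T> may expose more than this.
public interface IProperty<T>
{
    T Value { get; set; }
}

public static class Property
{
    // Hypothetical implementation that forwards reads and writes to the two delegates.
    private sealed class DelegateProperty<T> : IProperty<T>
    {
        private readonly Action<T> _set;
        private readonly Func<T> _get;

        public DelegateProperty(Action<T> set, Func<T> get)
        {
            _set = set;
            _get = get;
        }

        public T Value
        {
            get { return _get(); }
            set { _set(value); }
        }
    }

    public static IProperty<T> Bind<T>(Action<T> set, Func<T> get)
    {
        return new DelegateProperty<T>(set, get);
    }
}

public static class Demo
{
    public static int Run()
    {
        int n = 1;

        // Hand-written form of what &n would expand to under the proposed sugar.
        IProperty<int> p = Property.Bind(v => n = v, () => n);

        p.Value = 42;        // writes through to the captured local n
        return n + p.Value;  // both operands read the same variable
    }
}
```

Both lambdas capture the same variable, which is why the compiler has to bind the operand once as an assignment target and once as a plain value.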
So in essence I need to synthesize two lambdas: one that binds n as an lvalue and assigns to it from its parameter, and one that takes no parameters and simply evaluates n. I then need to make an invocation to Property.Bind, passing those two lambdas as the arguments.
My previous experiments have been much easier than this because they were able to piggy-back on easy-to-find existing features, so there was practically zero work to do!
But this time I'm struggling to find anything similar to what I'm doing here. So far I've been stepping through how a BoundLambda is built by the compiler from the source, and it's unfolding into a big ol' mess. I was modifying BindAddressOfExpression in Binder_Operators.cs, making it start with an extra if statement for safe contexts:
private BoundExpression BindAddressOfExpression(PrefixUnaryExpressionSyntax node, DiagnosticBag diagnostics)
{
    if (!this.InUnsafeRegion)
    {
        // Bind the operand twice: once to read it, once as an assignment target.
        BoundExpression rValue = BindValue(node.Operand, diagnostics, BindValueKind.RValue);
        BoundExpression lValue = BindValue(node.Operand, diagnostics, BindValueKind.Assignment);
        // Synthesize the setter lambda's parameter and the assignment "operand = __v".
        var valueParamSymbol = new SourceSimpleParameterSymbol(null, rValue.Type, 0, RefKind.None, "__v", ImmutableArray<Location>.Empty);
        var valueParam = new BoundParameter(node, valueParamSymbol);
        var assignment = new BoundAssignmentOperator(node, lValue, valueParam, RefKind.None, rValue.Type);
        // Wrap the assignment in a compiler-generated block with an implicit return.
        var assignmentStatement = new BoundExpressionStatement(node, assignment);
        var assignmentBlock = new BoundBlock(node, ImmutableArray<LocalSymbol>.Empty, ImmutableArray.Create<BoundStatement>(assignmentStatement)) { WasCompilerGenerated = true };
        assignmentBlock = FlowAnalysisPass.AppendImplicitReturn(assignmentBlock);
So (presumably!) now I have the assignment block for the first lambda, but getting a complete BoundLambda around it looks to be a whole new challenge.
I'm wondering: is there a way to "cheat" for this kind of syntactic sugar, by asking the parser/binder to work on a string of C#, as if it had appeared in place of the actual code? That way, manually constructing all the parts and stitching them together won't be necessary. After all, the existing compiler is ideally suited for this!
UPDATED: I've settled on a new class called SyntaxTemplate, which is immutable and so can be created statically and reused, e.g.
private static readonly SyntaxTemplate _pointerIndirectionTemplate
    = new SyntaxTemplate("p.Value");
private static readonly SyntaxTemplate _propertyReferenceTemplate
    = new SyntaxTemplate("System.Property.Bind(__v_pr__ => o = __v_pr__, () => o)");
private static readonly SyntaxTemplate _propertyReferenceTypeTemplate
    = new SyntaxTemplate("System.IProperty<T>");
private static readonly SyntaxTemplate _enumerableTypeTemplate
    = new SyntaxTemplate("System.Collections.Generic.IEnumerable<T>");
It internally has an immutable dictionary of all the identifiers, so any can be replaced by name, e.g. for an expression:
if (!operand.Type.IsPointerType())
    return BindExpression(
        _pointerIndirectionTemplate.Replace("p", node.Operand).Syntax,
        diagnostics);
Or for a type:
if (this.IsIndirectlyInIterator || !this.InUnsafeRegion)
    return BindNamespaceOrTypeOrAliasSymbol(
        _enumerableTypeTemplate.Replace("T", node.ElementType).Syntax,
        diagnostics, basesBeingResolved, suppressUseSiteDiagnostics);
SyntaxTemplate looks like this:
internal class SyntaxTemplate
{
    public ExpressionSyntax Syntax { get; private set; }

    private readonly ImmutableDictionary<string, ImmutableList<IdentifierNameSyntax>> _identifiers;

    public SyntaxTemplate(string source)
    {
        Syntax = SyntaxFactory.ParseExpression(source);

        var identifiers = ImmutableDictionary<string, ImmutableList<IdentifierNameSyntax>.Builder>.Empty.ToBuilder();

        foreach (var node in Syntax.DescendantNodes().OfType<IdentifierNameSyntax>())
        {
            ImmutableList<IdentifierNameSyntax>.Builder list;
            if (!identifiers.TryGetValue(node.Identifier.Text, out list))
                list = identifiers[node.Identifier.Text] =
                    ImmutableList<IdentifierNameSyntax>.Empty.ToBuilder();
            list.Add(node);
        }

        _identifiers = identifiers.ToImmutableDictionary(
            p => p.Key, p => p.Value.ToImmutableList());
    }

    private SyntaxTemplate(ExpressionSyntax syntax,
        ImmutableDictionary<string, ImmutableList<IdentifierNameSyntax>> identifiers)
    {
        Syntax = syntax;
        _identifiers = identifiers;
    }

    public SyntaxTemplate Replace(string identifier, SyntaxNode value)
    {
        return new SyntaxTemplate(
            Syntax.ReplaceNodes(_identifiers[identifier], (o1, o2) => value),
            _identifiers.Remove(identifier));
    }
}
Because the replacement value is a SyntaxNode, you can use the node already created by the parser, so there is no wasted effort reparsing the same syntax twice.
YET MORE: This works-ish except that if there are errors in the user's source (e.g. they use the new syntax in a situation where it makes no sense) then the errors generated during binding refer to locations in the template source, which are meaningless in the user's source. So the IDE cannot display red squiggles, etc.
To get around this, you can use a helper method that captures the diagnostics in a temporary bag, and then replays them into the real bag with the location changed to the place in the user's source where your syntax was used:
private T RedirectDiagnostics<T>(DiagnosticBag diagnostics, CSharpSyntaxNode nodeWithLocation, Func<DiagnosticBag, T> generate)
{
    var captured = new DiagnosticBag();
    var result = generate(captured);
    foreach (var diag in captured.AsEnumerable().OfType<DiagnosticWithInfo>())
        diagnostics.Add(new CSDiagnostic(diag.Info, nodeWithLocation.Location));
    return result;
}
Example usage, just wrapping the first example from above:
if (!operand.Type.IsPointerType())
    return RedirectDiagnostics(diagnostics, node, redirected =>
        BindExpression(_pointerIndirectionTemplate.Replace("p", node.Operand).Syntax, redirected));
Now the red squiggles work properly (and in a true compilation, the line numbers on the error messages are correct).
I suggest you look at how query expressions "expand" into method calls with compiler-generated lambdas.
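As a reminder of what that expansion looks like from the outside, here is a small sketch. The class and method names are made up for illustration; the point is only that the compiler turns each query clause into a method call whose argument is a lambda it synthesizes, which is the same kind of work the & sugar needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryExpansionDemo
{
    // Query-expression form, as the user writes it.
    public static List<int> QueryForm(IEnumerable<int> xs)
    {
        return (from x in xs
                where x > 1
                select x * 2).ToList();
    }

    // Equivalent method-call form: each clause becomes a call taking a lambda.
    public static List<int> MethodForm(IEnumerable<int> xs)
    {
        return xs.Where(x => x > 1).Select(x => x * 2).ToList();
    }
}
```

The range variable x never appears as a declared parameter in the query form, so the compiler has to create the lambda parameter and bind the clause body against it, much like the __v parameter in the question.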