Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Non-shortcircuting boolean operators and C# 7 Pattern matching

Tags:

c#

c#-7.0

Im currently writing an C# application, targeting .NET 4.7 (C# 7). I am confused after I tried using the new way of declaring a variable utilizing the "is" keyword: if (variable is MyClass classInstance) This way it works, but when doing:

if (true & variable is MyClass classInstance)
{
    var a = classInstance;
}

Visual Studio (I'm using 2017) shows me the the Error Use of unassigned local variable 'classInstance'. Using the short-circuting version of & (&&) it works fine. Am I missing something about the & operator? (I know using the shortcircuting versions are much more commonly used, but at this point I'm just curious)

like image 476
BMI24 Avatar asked Aug 24 '17 16:08

BMI24


3 Answers

This is due to the rules on definite assignment, which have a special case for &&, but don't have a special case for &. I believe it's working as intended by the C# design team, but that the specification may have a little bit more work to do, at least in terms of clarity.

From the ECMA C# 5 standard, section 5.3.3.24:

For an expression expr of the form expr-first && expr-second:

...

The definite assignment state of v after expr is determined by:

  • If expr-first is a constant expression with the value false, then the definite assignment state of v after expr is the same as the definite assignment state of v after expr-first.
  • Otherwise, if the state of v after expr-first is definitely assigned, then the state of v after expr is definitely assigned.
  • Otherwise, if the state of v after expr-second is definitely assigned, and the state of v after expr-first is “definitely assigned after false expression”, then the state of v after expr is definitely assigned.
  • Otherwise, if the state of v after expr-second is definitely assigned or “definitely assigned after true expression”, then the state of v after expr is “definitely assigned after true expression”.
  • ...

The relevant part for this case is the one I've highlighted in bold. classInstance is "definitely assigned after true expression" with the pattern (expr-second above), and none of the earlier cases apply, so the overall state at the end of the condition is "definitely assigned after true expression". That means within the if statement body, it's definitely assigned.

There's no equivalent clause for the & operator. While there potentially could be, it would be complicated by the types involved - it would have to only apply to the & operator when used with bool expressions, and I don't think most of the rest of definite assignment deals with types in that way.

Note that you don't need to use pattern matching to see this.

Consider the following program:

using System;

class Program
{
    static void Main()
    {
        bool a = false;
        bool x;
        bool y = true;
        
        if (true & (y && (x = a)))
        {
            Console.WriteLine(x);
        }
    }
}

The expression y && (x = a) is another one where x ends up being "definitely assigned after true expression". Again, the code above fails to compile due to x not being definitely assigned, whereas if you change the & to && it compiles. So at least this isn't an issue in pattern matching itself.

What confuses me somewhat is why x isn't still "definitely assigned after true expression" due to 5.3.3.21 ("5.3.3.21 General rules for expressions with embedded expressions"), which contains:

  • The definite assignment state of v at the end of expr is the same as the definite assignment state at the end of exprn.

I suspect this is meant to only include "definitely assigned" or "not definitely assigned", rather than including the "definitely assigned after true expression" part - although it's not as clear as it should be.

like image 69
Jon Skeet Avatar answered Oct 21 '22 09:10

Jon Skeet


This one hurt my head, but I think I have got it figured out.

This confusion is caused by two quirks: the way the is operation leaves the variable undeclared (not null), and the way the compiler optimizes away Boolean, but not bitwise, operations.

Quirk 1. If the cast fails, the variable is unassigned (not null)

Per the documentation for the new is syntax:

If exp is true and is is used with an if statement, varname is assigned and has local scope within the if statement only.

If you read between the lines, this means that if the is cast fails, the variable is considered unassigned. This may be counterintuitive (some might expect it to be null instead). This means that any code within the if block that relies on the variable will not compile if there is any chance the overall if clause could evaluate to true without a type match present. So for example

This compiles:

if (instance is MyClass y)
{
    var x = y;
}

And this compiles:

if (true && instance is MyClass y)
{
    var x = y;
}

But this does not:

void Test(bool f)
{
    if (f && instance is MyClass y)
    {
        var x = y;  //Error: Use of unassigned local variable
    }
}

Quirk 2. Boolean operations are optimized away, binary ones are not

When the compiler detects a predestined Boolean result, the compiler will not emit unreachable code, and skips certain validations as a result. For example:

This compiles:

void Test(bool f)
{
    object neverAssigned;
    if (false && f)
    {
        var x = neverAssigned;  //OK (never executes)
    }
}

But if you use & instead of &&, it does not compile:

void Test(bool f)
{
    object neverAssigned;
    if (false & f)
    {
        var x = neverAssigned;  //Error: Use of unassigned local variable
    }
}

When the compiler sees something like true && it just ignores it completely. Thus

    if (true && instance is MyClass y)

Is exactly the same as

    if (instance is MyClass y)

But this

    if (true & instance is MyClass y)

Is NOT the same. The compiler still needs to emit code that performs the & operation and uses its output in a conditional statement. Or even if it doesn't, the current C# 7 compiler apparently performs the same validations as if it were. This may seem a little strange, but bear in mind that when you use & instead of &&, there is a guarantee that the & must execute, and though it seems unimportant in this example, the general case must allow for additional complexifying factors, such as operator overloading.

How the quirks combine

In the last example, the result of the if clause is determined at run time, not compile time. So the compiler can't be certain that y will end up being assigned before the contents of the if block are executed. Thus you get

    if (true & instance is MyClass y)
    {
        var x = y; //Error: use of unassigned local variable
    }

TLDR

In the situation of a compound logical operation, c# can't be sure that the overall if condition will resolve to true if and only if the cast is successful. Absent that certainty, it can't allow access to the variable, since it might be unassigned. An exception is made when the expression can be reduced to non-compound operation at compile time, for example by removing true &&.

Workaround

I think the way we are meant to use the new is syntax is as a single condition with an if clause. Adding true && at the beginning works because the compiler simply removes it. But anything else combined with the new syntax creates ambiguity about whether the new variable will be in an unassigned state when the code block runs, and the compiler can't allow that.

The workaround of course is to nest your conditions instead of combining them:

Won't work:

void Test(bool f)
{
    if (f & instance is MyClass y)
    {
        var x = y;  //Error: Use of unassigned local variable
    }
}

Works fine:

void Test(bool f)
{
    if (f)
    {
        if (instance is MyClass y)
        {
            var x = y;  //Works
        }
    }
}
like image 37
John Wu Avatar answered Oct 21 '22 09:10

John Wu


Am I missing something about the & operator?

No, I don't think so. Your expectations seem correct to me. Consider this alternative example, which does compile without error:

object variable = null;
MyClass classInstance;

if (true & ((classInstance = variable as MyClass) != null))
{
    var a = classInstance;
}

var b = classInstance;

(To me, it's more interesting to consider the assignment outside the if body, since that's where the short-circuiting would affect behavior.)

With the explicit assignment, the compiler recognizes classInstance as definitely assigned, in the assignments of both a and b. It should be able to do the same thing with the new syntax.

With logical and, short-circuiting or not shouldn't matter. Your first value is true, so the second half should always need to be evaluated to get the whole expression. As you've noted, the compiler does treat the & and && differently though, which is unexpected.

A variation on this is this code:

static void M3()
{
    object variable = null;

    if (true | variable is MyClass classInstance)
    {
        var a = classInstance;
    }
}

The compiler correctly identifies classInstance as not definitely assigned when || is used, but has the same apparent misbehavior with | (i.e. also saying that classInstance is not definitely assigned), even though with the non-short-circuiting operator, the assignment must happen regardless.

Again, the above works correctly with the assignment is explicit, rather than using the new syntax.

If this were just about the definite assignment rules not being addressed with the new syntax, then I would expect && to be as broken as &. But it's not. The compiler handles that correctly. And indeed, in the feature documentation (I hesitate to say "specification", because there's no ECMA-ratified C# 7 specification yet), it reads:

The type_pattern both tests that an expression is of a given type and casts it to that type if the test succeeds. This introduces a local variable of the given type named by the given identifier. That local variable is definitely assigned when the result of the pattern-matching operation is true. [emphasis mine]

Since short-circuiting produces correct behavior without pattern matching, and since pattern matching produces correct behavior without short-circuiting (and definite-assignment is explicitly addressed in the feature description), I would say this is straight-up a compiler bug. There's probably some overlooked interaction between non-short-circuiting Boolean operators and the way the pattern-matched expression is evaluated that causes the definite assignment to get lost in the shuffle.

You should consider reporting it to the authorities. I think these days, the Roslyn GitHub issue-tracking is where they track this sort of thing. It might help if you explain in your report how you found this and why that particular syntax is important in your scenario (since in the code you posted, the && operator works equivalently…the non-short-circuiting & doesn't seem to confer any advantage to the code).

like image 30
Peter Duniho Avatar answered Oct 21 '22 09:10

Peter Duniho