
Compile time errors and unreachable code

Tags:

c#

Ok, consider the following code:

private const int THRESHHOLD = 2;

static void Main(string[] args)
{
     string hello;

     if (THRESHHOLD > 1) return;
     Console.WriteLine(hello);        
}

Surprisingly, this code does not produce a "Use of unassigned local variable 'hello'" compile-time error. It simply gives the warning "Unreachable code detected".

Even though the code is unreachable, it still contains what would normally be a compile-time error, so I'd think the right thing to do is to report that error. If I were to do the following:

private const int THRESHHOLD = 2;

static void Main(string[] args)
{
     string hello;

     if (THRESHHOLD > 1) return;
     hello.LMFAO();       
}

Sure enough, I get a "'string' does not contain a definition for 'LMFAO' and no extension method 'LMFAO' accepting a first argument of type 'string' could be found (are you missing a using directive or an assembly reference?)" compile time error.

Why isn't it the same with the use of an unassigned variable?

EDIT: Changed the const variable so it is less distracting. I think many are missing the point of the question, which is why, depending on the case, a compile-time error does or does not take precedence over unreachable code.

InBetween asked Feb 23 '12 16:02


2 Answers

James Michael Hare's answer gives the de jure explanation: the local variable is definitely assigned, because the code is unreachable and all local variables are definitely assigned in unreachable code. Put another way: the program is only an error if there is a way to observe the state of the uninitialized local variable. In your program there is no way to observe the local, and therefore it is not an error.

Now, I note that the compiler is not required to be infinitely clever. For example:

void M()
{
    int x = 0;
    int y;
    if (x + 0 == x) return;
    Console.WriteLine(y);
}

You know and I know that the last line of the method is unreachable, but the compiler does not know that because the reachability analyzer does not know that zero is the additive identity of integers. The compiler thinks the last line might be reachable, and so gives an error.
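As a rough sketch of where the analyzer draws the line, the pair below contrasts a condition built from a compile-time constant with the same condition built from an ordinary local. The class and method names are hypothetical and chosen only for illustration. The warning and error numbers in the comments are what the C# compiler normally reports for these situations.

```csharp
using System;

// Hypothetical demo class; the names are illustrative, not from the original post.
public static class ReachabilityDemo
{
    // x is a compile-time constant, so "x + 0 == x" is a constant expression
    // that the compiler evaluates to true. Everything after the "if" is then
    // unreachable: y counts as definitely assigned there, and the compiler
    // reports only warning CS0162 (unreachable code detected).
    public static string WithConstant()
    {
        const int x = 0;
        int y;
        if (x + 0 == x) return "returned early";
        Console.WriteLine(y);
        return "fell through";
    }

    // x is an ordinary local, so the condition is not a constant expression.
    // The compiler has to assume the code after the "if" may be reachable,
    // so y must be assigned before it is read.
    public static string WithLocal()
    {
        int x = 0;
        int y = 0; // dropping "= 0" gives error CS0165 (use of unassigned local variable)
        if (x + 0 == x) return "returned early";
        Console.WriteLine(y);
        return "fell through";
    }
}
```

Both methods return early at run time; the only difference is what the compiler is able to prove.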

For more information on aspects of designing reachability and definite assignment analyzers in programming languages, see my articles on the subject:

http://blogs.msdn.com/b/ericlippert/archive/tags/reachability/

http://blogs.msdn.com/b/ericlippert/archive/tags/definite+assignment/

I note though that no one has answered the deeper question, which is why should the error be suppressed in unreachable code? As you note, we give other semantic analysis errors in unreachable code.

To consider the pros and cons of that decision, you have to think about why someone would have unreachable code in the first place. Either it is intentionally unreachable, or unintentionally unreachable.

If it is unintentionally unreachable then the program contains a bug. The warning already calls attention to the primary problem: the code is unreachable. There is something majorly wrong with the control flow of the method if there is unreachable code. Odds are good that the developer is going to have to make a serious change to the control flow of the method; any local variable analysis we do on the unreachable code is likely to be misleading noise. Let the developer fix the code so that everything is reachable, and then we'll do an analysis of the now-reachable code for control-flow-related errors.

If the unreachable code is unreachable because the developer intended it to be unreachable, then odds are good they are doing something like this:

// If we can Blah, then Frob. However, if we cannot Blah and we can Baz, then Foo.
void M()
{
    int y;
    // TODO: The Blah method has a bug and always throws right now; fix it later.
    if (false /* Blah(out y) */ )
    {
        Frob(y);
    }
    else if (Baz(out y))
    {
        Foo(y);
    }
}

Should Frob(y) be an error in this program?
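To try the question out, here is a minimal compilable variant of the method above. Baz, Foo and Frob are hypothetical stubs invented for this sketch, and M returns a string only so the result is easy to inspect. Under those assumptions it should compile with an unreachable-code warning on the Frob call and no definite assignment error.

```csharp
using System;

// Hypothetical stubs: Baz, Foo and Frob are placeholders, not real APIs.
public static class IntentionalDemo
{
    static bool Baz(out int y) { y = 42; return true; }
    static string Foo(int y) { return "Foo(" + y + ")"; }
    static string Frob(int y) { return "Frob(" + y + ")"; }

    public static string M()
    {
        int y;
        // The Blah call is deliberately disabled, as in the original example.
        if (false /* Blah(out y) */)
        {
            // Unreachable: warning CS0162 only. y counts as definitely
            // assigned here, so reading it is not an error.
            return Frob(y);
        }
        else if (Baz(out y))
        {
            // Reachable, and y really is assigned by the out parameter.
            return Foo(y);
        }
        return "neither";
    }
}
```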

Eric Lippert answered Oct 20 '22 14:10


If you look in the C# language specification in section 5, it states that:

For an initially unassigned variable to be considered definitely assigned at a certain location, an assignment to the variable must occur in every possible execution path leading to that location.

And further in 5.3.3.1:

v is definitely assigned at the beginning of any unreachable statement.

Since unreachable code is not on any possible execution path, the variable doesn't need to be assigned there to avoid an error.

As to your question of why an unknown method is a compiler error in unreachable code and an unassigned variable isn't, you have to consider the rules quoted above. Being unreachable doesn't exempt code from having to be valid. The code still has to compile; the only difference unreachability makes is that all initially unassigned variables are considered definitely assigned at that point. That doesn't mean you can put in something invalid, like a reference to an undefined variable or method.

The error message for unassigned variables gives us a hint too in that it tells us an initially unassigned variable must be assigned before use, but because the code is unreachable, it isn't technically being used...
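A small sketch of that distinction, using a hypothetical class name and an unconditional throw to make the following statement unreachable. The diagnostic numbers in the comments are the ones the C# compiler normally uses for these cases.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class UnreachableStillChecked
{
    public static int AfterThrow()
    {
        string hello;
        throw new InvalidOperationException("always thrown");

        // Unreachable: warning CS0162. Because the statement is unreachable,
        // hello is treated as definitely assigned, so there is no CS0165 here.
        return hello.Length;

        // The unreachable code is still bound and type-checked, though.
        // Something like "return hello.NoSuchMember;" would be error CS1061
        // even at this position.
    }
}
```

Calling AfterThrow always throws, so the unreachable read of hello can never be observed.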

James Michael Hare answered Oct 20 '22 14:10
