
Linq implicitly typed range variable

Tags:

c#

With Linq, the range variable (e) can be implicitly typed from the array/collection (emps) it is coming from, but a foreach statement cannot do the same thing without the var keyword, or a type. Why is this?

In ex1 the compiler knows e is of type Employee without a var keyword or anything else. Why can't the foreach loop in ex2 do the same thing? There you have to provide a type (whether it's var or some explicit type).

ex1.

    Employee[] emps = { new Employee(1, "Daniel", "Cooley", 7, 57.98M) };

    public void SortByLastname()
    {
      var sortedByLastname =
            from e in emps
            orderby e.LastName
            select e.FirstName;
    }

ex2.

        foreach (Employee empl in emps)
        {
            Console.WriteLine("Employee " + empl);
        }

This may be overanalyzing, but I'm trying to get to the bottom of why this is the case.

The answer may very well be that Linq query syntax is set up to automatically deduce the type of the range variable and the foreach statement is not. Can someone help explain why this is?

mgmedick asked Mar 21 '12 04:03


2 Answers

UPDATE: This question was the subject of my blog on June 25th, 2012. Thanks for the great question!


With Linq, the range variable can be implicitly typed from the collection it is coming from, but a foreach statement cannot do the same thing without the var keyword.

That is correct.

Why is this?

I never know how to answer "why" questions. So I'll pretend you asked a different question:

There are two distinct ways that a named variable may be implicitly typed. A named local variable, for loop variable, foreach loop variable, or using statement variable may be implicitly typed by substituting "var" for its explicit type. A lambda parameter or query range variable may be implicitly typed by omitting its type altogether.

Correct.
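The two styles can be put side by side in a minimal sketch. The class name ImplicitTypingDemo, the method Run, and the sample data are assumptions made up for illustration; only the language constructs come from the discussion above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical demo class; names and data are illustrative only.
public static class ImplicitTypingDemo
{
    public static List<string> Run()
    {
        var results = new List<string>();
        int[] numbers = { 3, 1, 2 };

        // Style 1: "var" is substituted for the explicit type.
        var total = 0;                                   // local variable: int
        for (var i = 0; i < numbers.Length; i++)         // for loop variable: int
            total += numbers[i];
        foreach (var item in numbers)                    // foreach loop variable: int
            total += item;
        using (var reader = new StringReader("hello"))   // using variable: StringReader
            results.Add(reader.ReadLine());

        // Style 2: the type is omitted altogether.
        Func<int, int> square = x => x * x;              // lambda parameter x: int
        var sorted = from n in numbers                   // range variable n: int
                     orderby n
                     select square(n);

        results.Add(total.ToString());
        results.Add(string.Join(",", sorted));
        return results;
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

In every case the compiler infers the same static type it would have required you to write; the only difference is whether a "var" token marks the declaration.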

That is an inconsistency. A basic design principle is that inconsistency is to be avoided because it is confusing; the user naturally assumes that an inconsistency conveys meaning. Could these features have been made consistent?

Indeed, there are two ways they could have been made consistent. The first is to require "var" everywhere, so that you would say:

    Func<double, double> f = (var x) => Math.Sin(x);
    var query = from var customer in customers
                join var order in orders on customer.Id equals ...

All design is a series of compromises. This meets the consistency test but now feels clunky and verbose.

The second is to eliminate "var" everywhere, so that you would say:

    x = 12; // Same as "int x = 12;"
    using (file = ...) ...
    for (i = 0; i < 10; ++i) ...
    foreach (c in customers) ...

In the first three cases we have now inadvertently added the feature of "implicitly declared locals" rather than "implicitly typed locals". It seems odd and non-C#-like to have a new local variable declared just because you assigned something to a name that was not previously used. This is the sort of feature we'd expect in a language like JScript or VBScript, not C#.

However, in the foreach block it is clear from the context that a local variable is being introduced. We could eliminate "var" here without causing too much confusion, because the "in" is not mistaken for an assignment.

OK, so let's sum up our possible features:

  • Feature 1: require var everywhere.
  • Feature 2: require var nowhere.
  • Feature 3: require var on locals, for loops and usings, but not on foreach loops, lambdas or range variables.
  • Feature 4: require var on locals, for loops, usings and foreach loops, but not on lambdas or range variables.

The first two have the benefits of consistency, but consistency is only one factor. The first one is clunky. The second one is too dynamic and confusing. The third and fourth ones seem like plausible compromises, though they are not consistent.

The question then is: is the foreach loop variable more like a local variable or more like a lambda parameter? Clearly it is more like a local variable; in fact, the foreach loop is specified as a rewrite in which the loop variable becomes a local variable. For consistency with the "for" loop, and consistency with the C# 1.0 and C# 2.0 usage of the foreach loop, which required a type of some kind, we choose option four as superior to option three.
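To make the "rewrite" point concrete, here is a simplified sketch of how a foreach over an IEnumerable<string> can be expanded by hand. The class and method names are hypothetical, and the expansion is an approximation: the real specification also covers arrays and pattern-based enumerators, and the exact placement of the loop variable has differed between language versions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; the expansion below is a simplification.
public static class ForeachExpansionDemo
{
    // The loop as you would normally write it.
    public static List<string> WithForeach(IEnumerable<string> names)
    {
        var output = new List<string>();
        foreach (string name in names)
            output.Add("Employee " + name);
        return output;
    }

    // Roughly what the loop above is rewritten into.
    public static List<string> Expanded(IEnumerable<string> names)
    {
        var output = new List<string>();
        IEnumerator<string> enumerator = names.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                // The loop variable is an ordinary local variable.
                string name = enumerator.Current;
                output.Add("Employee " + name);
            }
        }
        finally
        {
            if (enumerator != null) enumerator.Dispose();
        }
        return output;
    }

    public static void Main()
    {
        var names = new[] { "Daniel", "Ann" };
        Console.WriteLine(string.Join("; ", WithForeach(names)));
        Console.WriteLine(string.Join("; ", Expanded(names)));
    }
}
```

Because the loop variable ends up as a plain local declaration, it is natural for it to follow the same rule as other locals: it needs either an explicit type or "var".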

I hope that answers your question. If not, then ask a much more specific question.

Eric Lippert answered Sep 18 '22 17:09


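The reason you do not need to list the type in ex1 is that the query is broken down (basically) into extension method calls that take lambdas. You should be able to rewrite ex2 in the same style; because emps is an array, that means Array.ForEach:

    Array.ForEach(emps, empl => Console.WriteLine("Employee " + empl));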

Notice that you do not need to state the type explicitly, as it is inferred from emps.

So ex1 will break down to:

    emps.OrderBy(e => e.LastName).Select(e => e.FirstName);
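A small runnable sketch can confirm that the two forms behave the same. The Employee class here is a hypothetical two-property stand-in for the one in the question (whose full definition is not shown), and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Employee type in the question.
public class Employee
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    public Employee(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class QueryTranslationDemo
{
    public static readonly Employee[] Emps =
    {
        new Employee("Daniel", "Cooley"),
        new Employee("Ann", "Baker"),
    };

    // Query syntax: the range variable e has no type annotation.
    public static List<string> QuerySyntax()
    {
        var query = from e in Emps
                    orderby e.LastName
                    select e.FirstName;
        return query.ToList();
    }

    // Method syntax: e is now a lambda parameter, inferred as Employee.
    public static List<string> MethodSyntax()
    {
        return Emps.OrderBy(e => e.LastName)
                   .Select(e => e.FirstName)
                   .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", QuerySyntax()));
        Console.WriteLine(string.Join(", ", MethodSyntax()));
    }
}
```

In both forms the type of e is inferred as Employee from the element type of Emps, which is why neither needs a type annotation.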

For a much fuller understanding of this and more, I really suggest buying Jon Skeet's book C# in Depth, Second Edition.

Justin Pihony answered Sep 18 '22 17:09