I'm reading the book "LINQ Pocket Reference" and there is a particular example (slightly modified below) that I'm having difficulty getting my head around... The explanation in the book is a bit brief, so I was wondering if someone could break it down step-by-step for me so that it makes sense...
IEnumerable<char> query2 = "Not what you might expect";
foreach (char vowel in "aeiou")
{
    var t = vowel;
    query2 = query2.Where(c => c != t);
    // iterate through query and output (snipped for brevity)
}
Outputs this:
Not wht you might expect
Not wht you might xpct
Not wht you mght xpct
Nt wht yu mght xpct
Nt wht y mght xpct
Which makes perfect sense to me... However, this does not.
IEnumerable<char> query2 = "Not what you might expect";
foreach (char vowel in "aeiou")
{
    query2 = query2.Where(c => c != vowel);
    // iterate through query and output (snipped for brevity)
}
Not wht you might expect
Not what you might xpct
Not what you mght expect
Nt what yu might expect
Not what yo might expect
which doesn't...
Can someone give me a better explanation of exactly what is going on here?
What happens in the first example is that the value of vowel is copied into a new local variable, t, which is scoped to the body of the foreach loop, so each iteration gets its own t.
The Where clause for the query then uses that variable. Where clauses like this use an anonymous method/lambda, which can capture local variables. Because each t is never changed after it has been captured, each lambda effectively holds on to the value vowel had during its own iteration.
In the second case, however, the lambda doesn't capture the current value, only which variable to use. Since that variable changes each time the loop executes, you build a new Where clause on top of the last one, but you also effectively modify all the preceding ones, because they all read the same variable.
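To see the difference without LINQ getting in the way, here is a minimal sketch. The class and method names are made up for illustration and are not from the book. It deliberately declares vowel outside the loop, so the shared-variable case should behave the same whichever compiler version you use:

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Every lambda captures the single variable 'vowel' declared outside the loop.
    public static string SharedVariable()
    {
        var readers = new List<Func<char>>();
        char vowel = ' ';
        foreach (char v in "aei")
        {
            vowel = v;
            readers.Add(() => vowel);   // captures the variable, not its current value
        }
        // The lambdas only run here, after the loop has finished.
        return new string(readers.ConvertAll(f => f()).ToArray());
    }

    // Every lambda captures its own 't', created afresh on each iteration.
    public static string CopyPerIteration()
    {
        var readers = new List<Func<char>>();
        foreach (char v in "aei")
        {
            char t = v;                 // new variable each time round the loop
            readers.Add(() => t);
        }
        return new string(readers.ConvertAll(f => f()).ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(SharedVariable());    // iii
        Console.WriteLine(CopyPerIteration());  // aei
    }
}
```

The three lambdas in SharedVariable all read the same variable after the loop has ended, which is why they all return the last value assigned to it.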
So in the first example, you get this type of query:
IEnumerable<char> query2 = "Not what you might expect";
char t1 = 'a'; query2 = query2.Where(c => c != t1);
char t2 = 'e'; query2 = query2.Where(c => c != t2);
char t3 = 'i'; query2 = query2.Where(c => c != t3);
char t4 = 'o'; query2 = query2.Where(c => c != t4);
char t5 = 'u'; query2 = query2.Where(c => c != t5);
In the second example, you get this:
IEnumerable<char> query2 = "Not what you might expect";
char vowel = 'a'; query2 = query2.Where(c => c != vowel);
vowel = 'e'; query2 = query2.Where(c => c != vowel);
vowel = 'i'; query2 = query2.Where(c => c != vowel);
vowel = 'o'; query2 = query2.Where(c => c != vowel);
vowel = 'u'; query2 = query2.Where(c => c != vowel);
By the time you execute this second example, the value of vowel will be 'u', so only the u will be stripped out. You do, however, have five Where filters chained over the same string, all trying to strip out the 'u'; only the first one actually removes anything, since nothing is left for the other four to remove.
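If you want to run the two variants side by side, something like the following sketch should do it (VowelFilterDemo and its method names are hypothetical). Note that from C# 5 onwards foreach gives each iteration a fresh loop variable, so the second snippet from the question behaves like the first on current compilers. To reproduce the older behaviour regardless of compiler version, this sketch uses a for loop over a variable declared outside it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VowelFilterDemo
{
    private const string Text = "Not what you might expect";

    // Each Where clause captures its own copy 't', so all five vowels are filtered.
    public static string WithTempCopy()
    {
        IEnumerable<char> query = Text;
        for (int i = 0; i < 5; i++)
        {
            char t = "aeiou"[i];
            query = query.Where(c => c != t);
        }
        return new string(query.ToArray());
    }

    // All five Where clauses capture the same 'vowel', which holds 'u' when the query runs.
    public static string WithSharedVariable()
    {
        IEnumerable<char> query = Text;
        char vowel = ' ';
        for (int i = 0; i < 5; i++)
        {
            vowel = "aeiou"[i];
            query = query.Where(c => c != vowel);
        }
        return new string(query.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(WithTempCopy());        // Nt wht y mght xpct
        Console.WriteLine(WithSharedVariable());  // Not what yo might expect
    }
}
```

Both methods enumerate the query only once, after the loop, so the results correspond to the last line of each output listing in the question.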
This capturing of variables is one of the things we all trip over when using anonymous methods/lambdas, and you can read more about it here: C# In Depth: The Beauty of Closures.
If you browse down that page to the text under Comparing capture strategies: complexity vs power, you'll find some examples of this behaviour.
Actually, on rereading it, it makes sense. Using the temp variable means that the temp itself is captured within the query... We go through the loop five times, so five separate temp variables are created, and each Where clause added to the query holds a reference to its own one.
In the case without the temp variable, there is only the reference to the loop variable.
So five references versus one reference. That's why it produces the results as shown.
In the first case, once it's evaluated the loop totally, the query has used the five references to the temp variables, hence stripping out a, e, i, o and u respectively.
In the second case, it's doing the same thing... only all five references are to the same variable which obviously only contains one value.
Moral of the story: Think "reference" not "value".
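The same point can be seen with a single query and no loop at all. This is only a sketch with made-up names (DeferredDemo, skip), and it relies on Where being lazily evaluated, so the lambda reads the variable each time the query is enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Enumerates the same query twice, changing the captured variable in between.
    public static string[] Run()
    {
        char skip = 'a';
        IEnumerable<char> query = "banana".Where(c => c != skip);

        string first = new string(query.ToArray());   // skip is 'a' at this point
        skip = 'n';                                    // the query itself is untouched
        string second = new string(query.ToArray());  // skip is now 'n'

        return new[] { first, second };
    }

    public static void Main()
    {
        foreach (string result in Run())
        {
            Console.WriteLine(result);   // bnn, then baaa
        }
    }
}
```

The query object never changes between the two enumerations; only the variable it refers to does.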
So, does this make sense to anyone else now?