LINQ projection (Select) returning an odd result

Tags: c#, linq

Consider the following code:

namespace ConsoleApp1
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Program
    {
        public static void Main(string[] args)
        {
            int count = default(int);

            IEnumerable<int> values1 = Enumerable.Range(1, 200)
                .OrderBy(o => Guid.NewGuid())
                .Take(100);

            IEnumerable<int> values2 = values1
                .OrderBy(o => Guid.NewGuid())
                .Take(50)
                .Select(o => { count++; return o; });

            Console.Read();
        }
    }
}

Steps to reproduce

  1. Put a breakpoint on Console.Read();
  2. Run to breakpoint
  3. Inspect count++ (should display 0)
  4. Inspect values2 and populate the Results View
  5. Inspect count++ (should display 100)

Problem

Given that I have only taken 50 items from values1, I would expect count++ to display 50. Why does it display 100?

Note: if this is confusing, try running the following code instead. It produces the same result:

namespace ConsoleApp1
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Program
    {
        public static void Main(string[] args)
        {
            int count = default(int);

            IEnumerable<int> values1 = Enumerable.Range(1, 100)
                .OrderBy(o => Guid.NewGuid())
                .Take(50);

            IEnumerable<int> values2 = values1
                .OrderBy(o => Guid.NewGuid())
                .Take(50)
                .Select(o => { count++; return o; });

            Console.Read();
        }
    }
}

Example

[Screenshot: Inspect count++]

[Screenshot: Inspect values2 (populate Results View)]

[Screenshot: Inspect count++]

Any explanation as to what is happening here, and how to fix it?

NOTE

Many of the given answers point to deferred execution. I know LINQ uses deferred execution, so unless I'm missing something, this is not the issue.

My point is that when the breakpoint is hit, the CLR has created a state machine for values2. Then this is iterated over in the debugger, count increments to 100 immediately for what appears to be only 1 iteration. This seems a little odd!

Also, I am aware that subsequent populations of the Results View of values2 cause count to increment, since each one causes a further iteration of the state machine.

Asked by Matthew Layton on Aug 04 '16 at 13:08


2 Answers

Every time you inspect values2, the expression is evaluated again -- and if you inspect it in the watch window, it appears to be evaluated twice each time (don't ask me why; ask the guys who wrote the watch window code). I got count == 300. Every time something evaluates it, it adds 50 to count; that's what the code does, see for yourself. And every time you expand it in the watch window, count increases by 100. Therefore, the watch window evaluates it twice.

You're only seeing one of those times, but so what? Lots of stuff goes on inside the VS code that it doesn't bother to show you. A GUI isn't a window into the internals of the program; it's a bunch of pixels on a screen that some code deliberately colored in. I could write a watch window that evaluates the expression nineteen times and shows you a Pokemon. What's the more plausible explanation: That some code you've never seen is doing something that doesn't happen to be shown in a GUI, or that sometimes your computer can't add?
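The "adds 50 on every evaluation" behaviour can be reproduced without a debugger. What follows is a minimal sketch, not the watch window's actual logic: the class name ReEnumerationDemo and the helper CountAfterEachPass are made up for illustration, and each empty foreach pass stands in for one evaluation of the expression by the debugger.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReEnumerationDemo
{
    // Hypothetical helper: records count before any enumeration and
    // after each of `passes` full enumerations of the same query.
    public static List<int> CountAfterEachPass(int passes)
    {
        int count = 0;

        IEnumerable<int> values1 = Enumerable.Range(1, 200)
            .OrderBy(o => Guid.NewGuid())
            .Take(100);

        IEnumerable<int> values2 = values1
            .OrderBy(o => Guid.NewGuid())
            .Take(50)
            .Select(o => { count++; return o; });

        // Building the query has not invoked the Select delegate yet.
        var snapshots = new List<int> { count };

        for (int pass = 0; pass < passes; pass++)
        {
            // One full evaluation of the query; the items are discarded.
            foreach (int unused in values2) { }
            snapshots.Add(count);
        }

        return snapshots;
    }

    public static void Main()
    {
        // Two passes mimic the expression being evaluated twice.
        Console.WriteLine(string.Join(", ", CountAfterEachPass(2))); // 0, 50, 100
    }
}
```

Two passes over the same query leave count at 100, which matches what the question reports after a single expansion of the Results View.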

Look at the runtime type of values2: System.Linq.Enumerable.WhereSelectEnumerableIterator<int, int>. That's no collection, that's something that's waiting to execute.
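The same check can be made from code instead of the debugger. This is a small sketch under assumptions: QueryTypeDemo and BuildQuery are invented names, and the iterator type name printed on the first line differs between .NET versions, so no expected output is given for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTypeDemo
{
    // Hypothetical helper: builds a query shaped like values2, without side effects.
    public static IEnumerable<int> BuildQuery()
    {
        return Enumerable.Range(1, 200)
            .OrderBy(o => Guid.NewGuid())
            .Take(50)
            .Select(o => o);
    }

    public static void Main()
    {
        IEnumerable<int> query = BuildQuery();

        // Some LINQ iterator type; the exact name depends on the runtime.
        Console.WriteLine(query.GetType().FullName);

        // The query object is not a stored collection.
        Console.WriteLine(query is ICollection<int>); // False

        // After ToList() the results live in a real collection.
        IEnumerable<int> materialized = query.ToList();
        Console.WriteLine(materialized is ICollection<int>); // True
    }
}
```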

Let's add ToList() to the end of that expression. That'll evaluate it once and store the results. Then you can inspect the results all day long without executing any LINQ expressions again.

int count = default(int);

IEnumerable<int> values1 = Enumerable.Range(1, 200)
    .OrderBy(o => Guid.NewGuid())
    .Take(100);

IEnumerable<int> values2 = values1
    .OrderBy(o => Guid.NewGuid())
    .Take(50)
    .Select(o => { count++; return o; })
    .ToList();

Now count == 50, because the expression is only evaluated once, and the results are stored in a List<T>.
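To check that the stored list really can be read repeatedly without touching count, here is a hedged sketch. MaterializeDemo and CountAfterRepeatedReads are hypothetical names, and the empty foreach loops stand in for repeated inspections in the debugger.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Hypothetical helper: materializes the query once, reads the list
    // `reads` times, and returns the final value of count.
    public static int CountAfterRepeatedReads(int reads)
    {
        int count = 0;

        IEnumerable<int> values1 = Enumerable.Range(1, 200)
            .OrderBy(o => Guid.NewGuid())
            .Take(100);

        // ToList() runs the query exactly once, right here.
        List<int> values2 = values1
            .OrderBy(o => Guid.NewGuid())
            .Take(50)
            .Select(o => { count++; return o; })
            .ToList();

        for (int read = 0; read < reads; read++)
        {
            // Reading the list does not re-run the Select delegate.
            foreach (int unused in values2) { }
        }

        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterRepeatedReads(0)); // 50
        Console.WriteLine(CountAfterRepeatedReads(5)); // 50
    }
}
```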

Moral of the story:

The dots on the screen are an illusion, and combining lazy evaluation with side effects is like setting a monkey loose in Starbucks with a machine gun. I'm not saying it's wrong, it's just not everybody's idea of a fun date.

Answered by 15ee8f99-57ff-4f92-890c-b56153 on Nov 15 '22 at 18:11


This is because LINQ uses deferred execution: until you explicitly call ToList() or iterate over the result, the delegate is not invoked.

When you view the result of the projection in Quick Watch, the delegate is invoked at that point to populate the results, as @Ed also mentioned.
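Deferred execution also applies per item: the delegate runs once for each element actually pulled from the query, not once per query. The sketch below illustrates this under assumptions; PartialEnumerationDemo and CountAfterPulling are invented names, and the enumerator is driven by hand so that the number of MoveNext calls is explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartialEnumerationDemo
{
    // Hypothetical helper: pulls at most `itemsToPull` items from the
    // query and returns how many times the Select delegate ran.
    public static int CountAfterPulling(int itemsToPull)
    {
        int count = 0;

        IEnumerable<int> values2 = Enumerable.Range(1, 200)
            .OrderBy(o => Guid.NewGuid())
            .Take(50)
            .Select(o => { count++; return o; });

        using (IEnumerator<int> e = values2.GetEnumerator())
        {
            // Each successful MoveNext() invokes the delegate exactly once.
            for (int i = 0; i < itemsToPull && e.MoveNext(); i++) { }
        }

        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterPulling(0));   // 0: nothing was iterated
        Console.WriteLine(CountAfterPulling(3));   // 3: only three items pulled
        Console.WriteLine(CountAfterPulling(100)); // 50: the query only has 50 items
    }
}
```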

Answered by Ehsan Sajjad on Nov 15 '22 at 17:11