Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does this method result in an infinite loop?

One of my coworkers came to me with a question about this method that results in an infinite loop. The actual code is a bit too involved to post here, but essentially the problem boils down to this:

private IEnumerable<int> GoNuts(IEnumerable<int> items) {     items = items.Select(item => items.First(i => i == item));     return items; } 

This should (you would think) just be a very inefficient way to create a copy of a list. I called it with:

var foo = GoNuts(new[]{1,2,3,4,5,6}); 

The result is an infinite loop. Strange.

I think that modifying the parameter is, stylistically a bad thing, so I changed the code slightly:

var foo = items.Select(item => items.First(i => i == item)); return foo; 

That worked. That is, the program completed; no exception.

More experiments showed that this works, too:

items = items.Select(item => items.First(i => i == item)).ToList(); return items; 

As does a simple

return items.Select(item => .....); 

Curious.

It's clear that the problem has to do with reassigning the parameter, but only if evaluation is deferred beyond that statement. If I add the ToList() it works.

I have a general, vague, idea of what's going wrong. It looks like the Select is iterating over its own output. That's a little bit strange in itself, because typically an IEnumerable will throw if the collection it's iterating changes.

What I don't understand, because I'm not intimately familiar with the internals of how this stuff works, is why re-assigning the parameter causes this infinite loop.

Is there somebody with more knowledge of the internals who would be willing to explain why the infinite loop occurs here?

like image 464
Jim Mischel Avatar asked Aug 13 '15 16:08

Jim Mischel


People also ask

What causes the loop to be infinite?

Usually, an infinite loop results from a programming error - for example, where the conditions for exit are incorrectly written. Intentional uses for infinite loops include programs that are supposed to run continuously, such as product demo s or in programming for embedded system s.

Why am I getting an infinite loop Python?

An Infinite Loop in Python is a continuous repetitive conditional loop that gets executed until an external factor interferes in the execution flow, like insufficient CPU memory, a failed feature/ error code that stopped the execution, or a new feature in the other legacy systems that needs code integration.

How do you know if its an infinite loop?

In each iteration you calculate a number (the sum of the squares of the digits or the previous number). You can put all those numbers in a Set. If in some iteration you calculate a number that is already in that Set, you know that you are stuck in an infinite loop.

What causes an infinite loop in C++?

The 'nCharPos >= 0' condition is always true because the nCharPos variable has the size_t type. This is an unsigned type, which means that the variable is always above or equal to zero. That's why the loop becomes an infinite one.


2 Answers

The key to answering this is deferred execution. When you do this

items = items.Select(item => items.First(i => i == item)); 

you do not iterate the items array passed into the method. Instead, you assign it a new IEnumerable<int>, which references itself back, and starts iterating only when the caller starts enumerating the results.

That is why all your other fixes have dealt with the problem: all you needed to do is to stop feeding IEnumerable<int> back to itself:

  • Using var foo breaks self-reference by using a different variable,
  • Using return items.Select... breaks self-reference by not using intermediate variables at all,
  • Using ToList() breaks self-reference by avoiding deferred execution: by the time items is re-assigned, old items has been iterated over, so you end up with a plain in-memory List<int>.

But if it's feeding on itself, how does it get anything at all?

That's right, it does not get anything! The moment you try iterating items and ask it for the first item, the deferred sequence asks the sequence fed to it for the first item to process, which means that the sequence is asking itself for the first item to process. At this point, it's turtles all the way down, because in order to return the first item to process the sequence must first get the first item to process from itself.

like image 187
Sergey Kalinichenko Avatar answered Oct 14 '22 16:10

Sergey Kalinichenko


It looks like the Select is iterating over its own output

You are correct. You are returning a query that iterates over itself.

The key is that you reference items within the lambda. The items reference is not resolved ("closed over") until the query iterates, at which point items now references the query instead of the source collection. That's where the self-reference occurs.

Picture a deck of cards with a sign in front of it labelled items. Now picture a man standing beside the deck of cards whose assignment is to iterate the collection called items. But then you move the sign from the deck to the man. When you ask the man for the first "item" - he looks for the collection marked "items" - which is now him! So he asks himself for the first item, which is where the circular reference occurs.

When you assign the result to a new variable, you then have a query that iterates over a different collection, and so does not result in an infinite loop.

When you call ToList, you hydrate the query to a new collection and also do not get an infinite loop.

Other things that would break the circular reference:

  • Hydrating items within the lambda by calling ToList
  • Assigning items to another variable and referencing that within the lambda.
like image 31
D Stanley Avatar answered Oct 14 '22 17:10

D Stanley