It's very weird because It's very apparent that the loop condition will never cause an Exception
Thread [] threads = new Thread[threadData.Length];
for (int i = 0; i < threadData.Length; i++)
{
threads[i]= new System.Threading.Thread(() => threadWork(threadData[i]));
threads[i].Start();
}
It just causes IndexOutOfBoundsException for threadData[i]
You have captured over the loop variable i
which may cause the last value of 'i' to be used when each thread eventually executes and retrieves data from threadData
. Assign i
to a variable in the loop and use that instead, e.g.:
Thread [] threads = new Thread[threadData.Length];
for (int i = 0; i < threadData.Length; i++)
{
int index = i;
threads[i]= new System.Threading.Thread(() => threadWork(threadData[index]));
threads[i].Start();
}
Eric Lippert has a very nice article on the phenomenon here:
http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx
http://blogs.msdn.com/b/ericlippert/archive/2009/11/16/closing-over-the-loop-variable-part-two.aspx
For an insight into why this happens, consider that the threads will execute at some undetermined point in the future, perhaps after the loop has finished. Start
signals that the thread should start, it actually starts asynchronously though i.e. not immediately. Given this, we can see that the lambda passed to the Thread
may execute well past the loop finishing. So how can it have a reference to i
?
In short the compiler will create a helper class that can encapsulate i
, then replace references to i
with this helper class. This allows the lambda to have a reference to i
outside of the scope of the loop. A nice example of compiler magic, but in this case has a non-obvious side-effect i.e. it captures over the loop variable:
private class LambdaHelper
{
public int VarI { get; set; }
}
private static void SomeMethod()
{
LambdaHelper helper = new LambdaHelper();
Thread[] threads = new Thread[threadData.Length];
for (helper.VarI = 0; helper.VarI < data.Length; helper.VarI++)
{
threads[helper.VarI] = new Thread(() => ThreadWork(data[helper.VarI]));
threads[helper.VarI].Start();
}
}
Here we can see that VarI
is used in place of i
. The non-obvious side-effect is that when the threads execute they all see a shared value i.e. VarI
. If the threads start after the loop has finished, they will all see the maximum value of i
.
The fix is to assign i
to a temporary variable inside the loop as detailed in the first code example.
It's the normal problem of loop capturing - you've captured the loop variable, so by the time the thread actually starts, i
is the final value, which is an invalid index into the array. The solution is to create a new variable inside the loop, and capture that instead:
Thread[] threads = new Thread[threadData.Length];
for (int i = 0; i < threadData.Length; i++)
{
int copy = i;
threads[i]= new System.Threading.Thread(() => threadWork(threadData[copy]));
threads[i].Start();
}
You can read more about this on Eric Lippert's blog: part 1; part 2.
Personally I would consider using List<T>
more though, and using foreach
where possible - or even LINQ. foreach
wouldn't have solved this problem, admittedly, but it would be generally cleaner IMO.
Here's an example of how you might have done it in LINQ:
List<Thread> threads = threadData.Select(x => new Thread(() => ThreadWork(x)))
.ToList();
foreach (Thread thread in threads)
{
thread.Start();
}
Or with a straight foreach loop, starting each thread as you go:
List<Thread> threads = new List<Thread>();
foreach (var data in threadData)
{
var dataCopy = data;
Thread thread = new Thread(() => ThreadWork(dataCopy));
thread.Start();
threads.Add(thread);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With