I'm trying to learn how to use the TPL. I've done a fair bit of reading on the subject but I can't understand why the following code sample breaks, and the one after works?
I've got a unit test running to count the records wrote, and also reading the outputted file in to double check.
Failing:
var tasks = new List<Task>();
Parallel.ForEach(File.ReadLines(featuresLocation), s =>
{
var feature = CreateFeature(s);
if (feature.HasValue)
{
tasks.Add(Task.Factory.StartNew(() =>
{
lock (_locker)
{
featuresWriter.WriteLine(feature.Value);
RecordsWrote++;
}
}));
}
});
Task.WaitAll(tasks.ToArray()); // Breaks
Working:
var tasks = new List<Task>();
Parallel.ForEach(File.ReadLines(featuresLocation), s =>
{
var feature = CreateFeature(s);
if (feature.HasValue)
{
tasks.Add(Task.Factory.StartNew(() =>
{
lock (_locker)
{
featuresWriter.WriteLine(feature.Value);
RecordsWrote++;
}
}));
}
});
Task.WaitAll(tasks.Where(x => x != null).ToArray()); // Works
var tasks = new List<Task>();
Parallel.ForEach(
{
tasks.Add(...);
});
This use of List<Task> tasks
is not thread-safe. You are seeing null
elements where they shouldn't be but run this a little longer and you may see other symptoms and exceptions as well. The behavior is undefined.
Protect the access to tasks
, replace it with a ConcurrentBag
or, my choice, drop those Tasks altogether. You are getting enough parallelism from the Parallel.ForEach()
.
You are seeing a cross-threading issue.
This code block accesses the none-thread safe List<T>
which can lead to unpredictable and none deterministic errors:
tasks.Add(Task.Factory.StartNew(() =>
{
...
}));
You need to lock the tasks.Add call:
lock(tasks)
{
tasks.Add(Task.Factory.StartNew(() =>
{
...
}));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With