Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parallel.Foreach - NULL Tasks

I'm trying to learn how to use the TPL. I've done a fair bit of reading on the subject but I can't understand why the following code sample breaks, and the one after works?

I've got a unit test running to count the records wrote, and also reading the outputted file in to double check.

Failing:

var tasks = new List<Task>();

Parallel.ForEach(File.ReadLines(featuresLocation), s =>
            {
                var feature = CreateFeature(s);
                if (feature.HasValue)
                {
                    tasks.Add(Task.Factory.StartNew(() =>
                        {
                            lock (_locker)
                            {
                                featuresWriter.WriteLine(feature.Value);
                                RecordsWrote++;
                            }
                        }));
                }
            });

        Task.WaitAll(tasks.ToArray()); // Breaks

Working:

var tasks = new List<Task>();

Parallel.ForEach(File.ReadLines(featuresLocation), s =>
            {
                var feature = CreateFeature(s);
                if (feature.HasValue)
                {
                    tasks.Add(Task.Factory.StartNew(() =>
                        {
                            lock (_locker)
                            {
                                featuresWriter.WriteLine(feature.Value);
                                RecordsWrote++;
                            }
                        }));
                }
            });

        Task.WaitAll(tasks.Where(x => x != null).ToArray()); // Works
like image 794
Jamez Avatar asked Feb 15 '23 03:02

Jamez


2 Answers

var tasks = new List<Task>();

Parallel.ForEach( 
{   
   tasks.Add(...);
});

This use of List<Task> tasks is not thread-safe. You are seeing null elements where they shouldn't be but run this a little longer and you may see other symptoms and exceptions as well. The behavior is undefined.

Protect the access to tasks , replace it with a ConcurrentBag or, my choice, drop those Tasks altogether. You are getting enough parallelism from the Parallel.ForEach().

like image 57
Henk Holterman Avatar answered Feb 17 '23 19:02

Henk Holterman


You are seeing a cross-threading issue.

This code block accesses the none-thread safe List<T> which can lead to unpredictable and none deterministic errors:

tasks.Add(Task.Factory.StartNew(() =>
    {
         ...
    }));

You need to lock the tasks.Add call:

lock(tasks)
{
    tasks.Add(Task.Factory.StartNew(() =>
      {
        ...
      }));
}
like image 28
David Roth Avatar answered Feb 17 '23 21:02

David Roth