Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dependency injection with parallel processing

I am trying to practice manual dependency injection (no framework yet) to remove tight coupling in my code. This is just for practice - I don't have a specific implementation in mind.

So far simple constructor injection has worked fine.

However I cannot work out how to solve creating a tight coupling when one class must use another within a parallel loop. Example:

public class Processor
{
    private IWorker worker;
    public Processor(IWorker worker)
    {
        this.worker = worker;
    }
    public List<string> DoStuff()
    {
        var list = new List<string>();
        for (int i = 0; i < 10; i++)
        {
            list.Add(worker.GetString());
        }
        return list;
    }

    public List<string> DoStuffInParallel()
    {
        var list = new System.Collections.Concurrent.ConcurrentBag<string>();

        Parallel.For(0, 10, i => 
        { 
            //is there any trivial way to avoid this??
            list.Add(new Worker().GetString());
        });
        return list.ToList();
    }
}

public class Worker : IWorker
{
    public string GetString()
    {
        //pretend this relies on some instance variable, so it not threadsafe
        return "a string";
    }
}

Avoiding the obvious fact that a parallel loop will be slower than a standard loop in the above case, how could i write the Processor.DoStuffInParallel() method to avoid the current hard dependency on the Worker class?

like image 467
Steve Avatar asked Oct 01 '22 12:10

Steve


1 Answers

One way to decouple this is by injecting a factory, e.g.:

public List<string> DoStuffInParallel(IWorkerFactory factory)
{
    var list = new System.Collections.Concurrent.ConcurrentBag<string>();

    Parallel.For(0, 10, i => 
    { 
        list.Add(factory.Create().GetString());
    });
    return list.ToList();
}

The factory could be an container-owned singleton, and the Create() would need to be thread safe.

Note of course that your tasks can't concurrently mutate the list - you'll need to synchronize access when adding the worker result to the list (apols, missed your ConcurrentBag)- In order to reduce contention on the bag, you might also want to look at one of the Parallel.For overloads with localinit / localFinally to do a local aggregation of results into a per-task list, before synchronizing to the aggregated / overall bag in the localFinally.

Edit
Re: Do I need to inject a factory for ConcurrentBag<String> ? IMO, this is fine to create the ConcurrentBag directly - it is an implementation specific detail, rather than a dependency. e.g. a Single threaded implementation may have implemented this as :

return Enumerable.Range(0, 10)
                 .Select(i => factory.Create().GetString())
                 .ToList();

i.e. without any explicit construction of the intermediate container.

You may choose to soften the interface to the method to public IList<string> DoStuffInParallel or even to IEnumerable<string> (the minimum possible contract / commitment). The dependency here is on the Worker, which is what you will want to be able to mock in unit testing.

like image 143
StuartLC Avatar answered Oct 04 '22 21:10

StuartLC