
Multithreading, lambdas and local variables

My question is: in the code below, can I be sure that the instance methods will be accessing the variables I think they will, or can those variables be changed by another thread while I'm still working? Do closures have anything to do with this, i.e. will I be working on a local copy of the IEnumerable<T>, so that enumeration is safe?

To paraphrase my question, do I need any locks if I'm never writing to shared variables?

public class CustomerClass
{
    private Config cfg = (Config)ConfigurationManager.GetSection("Customer");

    public void Run()
    {
        var serviceGroups = this.cfg.ServiceDeskGroups.Select(n => n.Group).ToList();

        var groupedData = DataReader.GetSourceData().AsEnumerable().GroupBy(n => n.Field<int>("ID"));
        Parallel.ForEach<IGrouping<int, DataRow>, CustomerDataContext>(
            groupedData,
            () => new CustomerDataContext(),
            (g, _, ctx) =>
            {
                var inter = this.FindOrCreateInteraction(ctx, g.Key);

                inter.ID = g.Key;
                inter.Title = g.First().Field<string>("Title");

                this.CalculateSomeProperty(ref inter, serviceGroups);

                return ctx;
            },
            ctx => ctx.SubmitAllChanges());
    }

    private Interaction FindOrCreateInteraction(CustomerDataContext ctx, int ID)
    {
        var inter = ctx.Interactions.Where(n => n.Id == ID).SingleOrDefault();

        if (inter == null)
        {
            inter = new Interaction();
            ctx.InsertOnSubmit(inter);
        }

        return inter;
    }

    private void CalculateSomeProperty(ref Interaction inter, IEnumerable<string> serviceDeskGroups)
    {
        // Only reads from the shared list passed in as a parameter. Changes the state of the ref'd object.
        if (serviceDeskGroups.Contains(inter.Group))
        {
            inter.Ours = true;
        }
    }
}
Vladislav Zorov asked Dec 05 '11 19:12



1 Answer

I seem to have found the answer and in the process, also the question.

The real question was whether local "variables" that are actually references to objects can be trusted under concurrent access. The answer is no: if the object has internal state that is not handled in a thread-safe manner, all bets are off. The closure doesn't help; it captures the variable, which holds only a reference to that object, not a copy of it.
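
A minimal sketch of that last point, assuming nothing beyond the base class library. The names ClosureDemo and CountSeenByLambda are hypothetical and not part of the code in the question. The lambda sees a change made to the list after it was captured, which shows that the closure holds a reference rather than a snapshot:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class ClosureDemo
{
    // Returns the element count observed by a lambda that captured
    // the list BEFORE the list was mutated.
    public static int CountSeenByLambda()
    {
        var groups = new List<string> { "A", "B" };

        // The lambda captures the variable 'groups' (a reference to the list),
        // not a copy of the list's contents.
        Func<int> count = () => groups.Count;

        // Mutate the list after the lambda has been created.
        groups.Add("C");

        // The lambda observes the mutated list.
        return count();
    }
}
```

If the mutation happened on another thread while the lambda was running, the lambda would be exposed to it in exactly the same way.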

In my specific case (concurrent reads from an IEnumerable<T> that is backed by a List<T>, with no writes to it) this is actually thread-safe, because each call to foreach, Contains(), Where(), etc. keeps its iteration state in a fresh IEnumerator (or in locals) that is only visible to the thread that requested it, and the underlying list is never modified while it is being read. Any other shared objects, however, must also be checked, one by one.
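
The read-only case can be sketched as follows. This is an illustration under stated assumptions, not the code from my app: ReadOnlySharingDemo, CountMatches and the group names are made up. The list is built before the loop and only read inside it, so it needs no lock; the counter is written by several threads, so it does need an atomic update:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, for illustration only.
public static class ReadOnlySharingDemo
{
    public static int CountMatches(IEnumerable<string> candidates)
    {
        // Built once, before the parallel loop, and never modified afterwards.
        var serviceGroups = new List<string> { "Helpdesk", "Network", "Database" };

        int matches = 0;

        Parallel.ForEach(candidates, candidate =>
        {
            // Read-only access to the shared list: safe without a lock
            // as long as no thread modifies the list concurrently.
            if (serviceGroups.Contains(candidate))
            {
                // 'matches' is shared AND written, so the update must be atomic.
                Interlocked.Increment(ref matches);
            }
        });

        return matches;
    }
}
```

The same list would stop being safe the moment any thread called Add() or Remove() on it while the loop is running.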

So, hooray, no locks or synchronized collections for me :)

Thanks to @ebb and @Dave, although you didn't answer the question directly, you pointed me in the right direction.


If you're interested in the results, this is a run on my home PC (a quad-core) with Thread.SpinWait to simulate the processing time of a row. The real app had an improvement of almost 2X (01:03 vs 00:34) on a dual-core hyper-threaded machine with SQL Server on the local network.

[Screenshot: single-threaded run] Single-threaded, using foreach. I don't know why, but there is a pretty high number of cross-core context switches.

[Screenshot: multithreaded run] Using Parallel.ForEach, lock-free with thread-locals where needed.
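
For readers unfamiliar with the thread-local overload of Parallel.ForEach, here is a small sketch of the pattern, with hypothetical names (ThreadLocalDemo, SumOfSquares) unrelated to the real app. Each worker accumulates into its own local value without synchronization, and the only synchronized step is the final merge:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, for illustration only.
public static class ThreadLocalDemo
{
    public static long SumOfSquares(IEnumerable<int> source)
    {
        long total = 0;

        Parallel.ForEach<int, long>(
            source,
            // localInit: each worker starts with its own private accumulator.
            () => 0L,
            // body: touches only the worker's private accumulator, so no lock is needed.
            (value, _, local) => local + (long)value * value,
            // localFinally: runs once per worker; merge into the shared total atomically.
            local => Interlocked.Add(ref total, local));

        return total;
    }
}
```

This has the same shape as the Run() method in the question, where the thread-local value is a CustomerDataContext instead of a number.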

Vladislav Zorov answered Oct 27 '22 20:10
