My question is: in the code below, can I be sure that the instance methods will access the variables I think they will, or can those variables be changed by another thread while I'm still working with them? Do closures have anything to do with this, i.e. will I be working on a local copy of the IEnumerable<T>, so that enumeration is safe?
To paraphrase my question, do I need any locks if I'm never writing to shared variables?
using System.Collections.Generic;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Threading.Tasks;

public class CustomerClass
{
    private Config cfg = (Config)ConfigurationManager.GetSection("Customer");

    public void Run()
    {
        // Materialized once; afterwards only read by the parallel loop.
        var serviceGroups = this.cfg.ServiceDeskGroups.Select(n => n.Group).ToList();
        var groupedData = DataReader.GetSourceData().AsEnumerable().GroupBy(n => n.Field<int>("ID"));

        Parallel.ForEach<IGrouping<int, DataRow>, CustomerDataContext>(
            groupedData,
            () => new CustomerDataContext(),   // one data context per worker
            (g, _, ctx) =>
            {
                var inter = this.FindOrCreateInteraction(ctx, g.Key);
                inter.ID = g.Key;
                inter.Title = g.First().Field<string>("Title");
                this.CalculateSomeProperty(ref inter, serviceGroups);
                return ctx;
            },
            ctx => ctx.SubmitAllChanges());
    }

    private Interaction FindOrCreateInteraction(CustomerDataContext ctx, int ID)
    {
        // Comparison (==), not assignment (=).
        var inter = ctx.Interactions.Where(n => n.ID == ID).SingleOrDefault();
        if (inter == null)
        {
            inter = new Interaction();
            ctx.InsertOnSubmit(inter);
        }
        return inter;
    }

    private void CalculateSomeProperty(ref Interaction inter, IEnumerable<string> serviceDeskGroups)
    {
        // Reads from the shared List<T> passed in as a parameter. Changes the state of the ref'd object.
        if (serviceDeskGroups.Contains(inter.Group))
        {
            inter.Ours = true;
        }
    }
}
I seem to have found the answer and in the process, also the question.
The real question was whether local "variables" that turn out to actually be objects can be trusted under concurrent access. The answer is no: if they happen to have internal state that is not handled in a thread-safe manner, all bets are off. The closure doesn't help; it does not copy anything, it just captures a reference to said object.
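To illustrate the point about closures, here is a minimal sketch (the class and method names are made up for this example). The lambda sees a change made to the list after the lambda was created, which shows that it holds a reference to the same object rather than a private copy:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Hypothetical helper, not part of the original code.
    public static int CountSeenByClosure()
    {
        var groups = new List<string> { "ServiceDesk" };

        // The lambda captures the variable 'groups'; nothing is copied here.
        Func<int> count = () => groups.Count;

        // Mutate the list after the closure has been created.
        groups.Add("Network");

        // The closure observes the mutation because it shares the same list.
        return count();
    }
}
```

Calling ClosureCaptureDemo.CountSeenByClosure() returns 2, not 1. So a closure by itself gives no isolation between threads.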
In my specific case (concurrent reads from an IEnumerable<T> and no writes to it), it is actually thread-safe, because each call to foreach, Contains(), Where(), etc. gets a fresh new IEnumerator, which is only visible to the thread that requested it. Any other objects, however, must also be checked, one by one.
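A rough sketch of that read-only scenario, assuming the shared collection is a List<string> that nobody modifies while the loop runs (ConcurrentReadDemo and CountMatches are hypothetical names, not from my real code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentReadDemo
{
    // Counts how many keys occur in 'shared', reading the list from
    // several threads at once without taking a lock on it.
    public static int CountMatches(List<string> shared, IEnumerable<string> keys)
    {
        int matches = 0;

        Parallel.ForEach(keys, key =>
        {
            // foreach calls shared.GetEnumerator(), so every invocation of
            // this body walks the list with its own private enumerator.
            foreach (var item in shared)
            {
                if (item == key)
                {
                    // The counter IS written concurrently, so it needs an atomic update.
                    Interlocked.Increment(ref matches);
                    break;
                }
            }
        });

        return matches;
    }
}
```

Only the shared counter needs an atomic operation, because it is the only thing being written. The list itself is only read, which List<T> supports from multiple threads as long as it is not modified at the same time.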
So, hooray, no locks or synchronized collections for me :)
Thanks to @ebb and @Dave, although you didn't answer the question directly, you pointed me in the right direction.
If you're interested in the results, this is a run on my home PC (a quad-core) with Thread.SpinWait to simulate the processing time of a row. The real app had an improvement of almost 2X (01:03 vs 00:34) on a dual-core hyper-threaded machine with SQL Server on the local network.
Single-threaded, using foreach. I don't know why, but there is a pretty high number of cross-core context switches.
Using Parallel.ForEach, lock-free with thread-locals where needed.
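For reference, the thread-local pattern looks roughly like this. This is a simplified sketch, not my actual test harness: ThreadLocalDemo, ProcessAll and the spinIterations parameter are made up, and Thread.SpinWait stands in for the real per-row work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalDemo
{
    // Processes 'rowCount' simulated rows and returns how many were handled.
    public static long ProcessAll(int rowCount, int spinIterations)
    {
        long total = 0;

        Parallel.ForEach<int, long>(
            Enumerable.Range(0, rowCount),
            () => 0L,                                  // localInit: private counter per worker
            (row, loopState, local) =>
            {
                Thread.SpinWait(spinIterations);       // stand-in for real row processing
                return local + 1;                      // touches only worker-local state
            },
            local => Interlocked.Add(ref total, local) // localFinally: merge once per worker
        );

        return total;
    }
}
```

The loop body never touches shared mutable state; each worker merges its private counter into the total exactly once at the end, so no lock is needed while rows are being processed.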