I have 100 records to process in parallel, numbered 1 to 100. I can conveniently use Parallel.For to execute them in parallel as follows, and the degree of parallelism will depend on the available computing resources:
Parallel.For(0, limit, i =>
{
DoWork(i);
});
But there are certain restrictions: each thread needs to work with an identical Data entity, and there is a limited number of Data entities, say 10, which are created in advance by cloning each other and stored in a structure such as a Dictionary or List. I can restrict the degree of parallelism using the following code:
Parallel.For(0, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
DoWork(i);
});
But the issue is how to assign a unique data entity to each incoming thread, such that the data entity is not used by any other thread currently executing. Since the number of threads and the number of data entities are the same, starvation is not an issue. One way I can think of is to keep a boolean flag for each data entity that records whether it is in use: a thread iterates through the dictionary or list to find the next available data entity, and the whole assignment step is locked so that only one thread is assigned a data entity at a time. In my view there must be a much more elegant solution; my version is a workaround, not really a fix. My logic is:
Parallel.For(0, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
lock (All_Threads_Common_Object)
{
// Check for an available data entity using its boolean flag
// Assign the data entity to this thread
}
DoWork(i);
// Reset the boolean flag so that another thread can use the entity
});
Please let me know if the question needs further clarification.
Use the overload of Parallel.For which accepts a thread-local initialization function.
Parallel.For<DataEntity>(0, limit,
//will run once for each thread
() => GetThreadLocalDataEntity(),
//main loop body, will run once per iteration
(i, loop, threadDataEntity) =>
{
DoWork(i, threadDataEntity);
return threadDataEntity; //we must return it here to adhere to the Func signature.
},
//will run once for each thread after the loop
(threadDataEntity) => threadDataEntity.Dispose() //if necessary
);
The main advantage of this method over the one you posted in the question is that a DataEntity is assigned once per worker thread (strictly, once per task participating in the loop), not once per loop iteration.
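The answer above leaves GetThreadLocalDataEntity undefined. One way to fill it in, as a minimal sketch only, is to take an entity from a shared pool in the initialization function and hand it back in the final delegate. The DataEntity class, its UseCount field, the ThreadLocalPoolDemo class and the Run method below are all hypothetical stand-ins, and the increment stands in for DoWork(i, entity). A BlockingCollection is assumed for the pool so that a task waits, rather than fails, if no entity happens to be free when it starts.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the real data entity.
public sealed class DataEntity
{
    public int Id { get; }
    public int UseCount;   // only touched by the task that currently owns the entity
    public DataEntity(int id) { Id = id; }
}

public static class ThreadLocalPoolDemo
{
    // Runs `limit` iterations over a pool of `poolSize` entities and
    // returns the total number of iterations recorded on the entities.
    public static int Run(int limit, int poolSize)
    {
        var pool = new BlockingCollection<DataEntity>(new ConcurrentQueue<DataEntity>());
        for (int n = 0; n < poolSize; n++) pool.Add(new DataEntity(n));

        var options = new ParallelOptions { MaxDegreeOfParallelism = poolSize };

        Parallel.For<DataEntity>(0, limit, options,
            // local init: each participating task takes one entity from the pool
            () => pool.Take(),
            // loop body: runs once per iteration with that task's entity
            (i, loopState, entity) =>
            {
                entity.UseCount++;   // stand-in for DoWork(i, entity)
                return entity;
            },
            // local finally: the task hands its entity back when it is done
            entity => pool.Add(entity));

        int total = 0;
        foreach (var e in pool) total += e.UseCount;   // enumerates a snapshot of the pool
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Run(100, 10));
    }
}
```

Because an entity is owned by exactly one task between the take and the hand-back, the body needs no locking around it.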
You can use a concurrent collection to store your 10 objects. Each worker pulls one data entity out, uses it, and gives it back. Using a concurrent collection is important, because in your scenario a normal collection is not thread-safe.
Like so:
var queue = new ConcurrentQueue<DataEntity>();
// fill the queue with 10 items
Parallel.For(0, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
DataEntity x;
if(!queue.TryDequeue(out x))
throw new InvalidOperationException();
DoWork(i, x);
queue.Enqueue(x);
});
Or, if blocking is required, wrap the queue in a BlockingCollection.
Edit: Do not wrap it in a loop to keep waiting. Rather, use the BlockingCollection like this:
var entities = new BlockingCollection<DataEntity>(new ConcurrentQueue<DataEntity>());
// fill the collection with 10 items
Parallel.For(0, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
DataEntity x = entities.Take();
DoWork(i, x);
entities.Add(x);
});
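For anyone who wants to try the blocking variant end to end, here is a self-contained sketch. PooledEntity, its InUse flag, BlockingPoolDemo and CountOverlaps are hypothetical names introduced only for this illustration, and Thread.Sleep stands in for DoWork. The try/finally is an addition not present in the snippet above: it is there so that an entity is returned to the pool even if the work throws.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real data entity.
public sealed class PooledEntity
{
    public int InUse;   // 0 = free, 1 = currently held by an iteration
}

public static class BlockingPoolDemo
{
    // Returns how many times an iteration received an entity
    // that another iteration was still holding.
    public static int CountOverlaps(int limit, int poolSize)
    {
        using (var entities = new BlockingCollection<PooledEntity>(new ConcurrentQueue<PooledEntity>()))
        {
            for (int n = 0; n < poolSize; n++) entities.Add(new PooledEntity());

            int overlaps = 0;
            var options = new ParallelOptions { MaxDegreeOfParallelism = poolSize };

            Parallel.For(0, limit, options, i =>
            {
                PooledEntity x = entities.Take();   // blocks until an entity is free
                try
                {
                    // Mark the entity as held; seeing 1 here would mean it was shared.
                    if (Interlocked.Exchange(ref x.InUse, 1) == 1)
                        Interlocked.Increment(ref overlaps);

                    Thread.Sleep(1);                // stand-in for DoWork(i, x)

                    Interlocked.Exchange(ref x.InUse, 0);
                }
                finally
                {
                    entities.Add(x);                // always hand the entity back
                }
            });

            return overlaps;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountOverlaps(100, 10));
    }
}
```

Since each entity sits in the collection at most once, only one iteration can hold it at a time, so the overlap count should come out as zero.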