I have an MVC website using EF for data access. The app takes in data, runs a series of calculations and stores the results. Each batch of data can have several thousand records, and the calculations take about 30 seconds on average, so I want to run all of this in the background.
So far I have Hangfire in place to trigger the batches. I then do:
var queue = new Queue<MyItem>();
// queue is populated ...
while (queue.Any())
{
    var item = queue.Dequeue();
    var task = Task.Run(() =>
    {
        using (var context = new MyDbContext())
        {
            context.MyItem.Add(item);
            // Run Calculations
            try {
                context.SaveChanges();
            }
            catch {
                // Log error
            }
        }
    });
}
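For context, the batch itself is kicked off from Hangfire with something along these lines (ProcessBatch and batchId are placeholder names for my job method and its argument):

// Hangfire job that populates the queue and runs the loop above.
// ProcessBatch / batchId are illustrative names only.
BackgroundJob.Enqueue(() => ProcessBatch(batchId));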
When a batch is running, the site either becomes completely unresponsive or I receive 'The underlying provider failed on Open' errors.
Is there a better approach to this?
It seems that you're creating tasks with Task.Run and never waiting for them to complete. That means you generate a task for every item in the queue, and they all run concurrently on different ThreadPool threads, which is a heavy load that can (and probably does) starve your regular requests. Each task also opens its own DbContext, and therefore its own database connection, so you can exhaust the connection pool; that is most likely where the 'The underlying provider failed on Open' errors come from.
You should limit the concurrency of these tasks in some way. The simplest option IMO is TPL Dataflow's ActionBlock: you create the block with a delegate and options (e.g. MaxDegreeOfParallelism), post items into it and wait for it to complete:
var block = new ActionBlock<MyItem>(item =>
{
    using (var context = new MyDbContext())
    {
        context.MyItem.Add(item);
        // Run Calculations
        try {
            context.SaveChanges();
        }
        catch {
            // Log error
        }
    }
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

while (queue.Any())
{
    var item = queue.Dequeue();
    block.Post(item);
}

block.Complete();
await block.Completion;
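ActionBlock lives in the System.Threading.Tasks.Dataflow NuGet package. If you'd rather not take that dependency, a SemaphoreSlim gives you similar throttling; here's a minimal sketch assuming the same MyItem and MyDbContext types as above (needs System.Threading, System.Collections.Generic and System.Threading.Tasks):

var throttle = new SemaphoreSlim(2); // at most 2 items in flight at once
var tasks = new List<Task>();

while (queue.Any())
{
    var item = queue.Dequeue();
    await throttle.WaitAsync(); // wait here until a slot is free
    tasks.Add(Task.Run(() =>
    {
        try
        {
            using (var context = new MyDbContext())
            {
                context.MyItem.Add(item);
                // Run Calculations
                context.SaveChanges();
            }
        }
        finally
        {
            throttle.Release(); // free the slot even if this item failed
        }
    }));
}

await Task.WhenAll(tasks);

Either way, run this inside the Hangfire job method itself rather than firing and forgetting another task; recent Hangfire versions will await a job method declared as async Task.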