I have a list of IDs, and I need to run several stored procedures on each ID.
When I use a standard foreach loop, it works OK, but when I have many records, it is pretty slow.
I wanted to convert the code to work with EF, but I am getting an exception: "The underlying provider failed on Open".
I am using this code, inside the Parallel.ForEach:
using (XmlEntities osContext = new XmlEntities()) { //The code }
But it still throws the exception.
Any idea how I can use Parallel with EF? Do I need to create a new context for every procedure I am running? I have around 10 procedures, so I think it's very bad to create 10 contexts, one for each.
I have researched this, and I agree that DbContext is not thread-safe. The pattern I propose does use multiple threads, but a single DbContext is only ever accessed by a single thread in a single-threaded fashion.
Parallel.ForEach is like the foreach loop in C#, except that the foreach loop runs on a single thread and processes items sequentially, while the Parallel.ForEach loop runs on multiple threads and processes items in parallel.
DbContext is not thread-safe. Do not share contexts between threads. Make sure to await all async calls before continuing to use the context instance. An InvalidOperationException thrown by EF Core code can put the context into an unrecoverable state.
Parallel.For partitions the work into a number of concurrent iterations. By default it uses the default task scheduler to schedule the iterations, which essentially uses the current thread as well as a number of thread pool threads. There are overloads that allow you to change this behavior.
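One of those overloads takes a ParallelOptions instance, which can cap the number of concurrent iterations. The sketch below is only an illustration: BoundedParallelDemo, Run, and the work delegate are hypothetical names, and the delegate stands in for whatever database call each ID needs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelDemo
{
    // Runs `work` for each id with at most `maxParallel` concurrent iterations
    // and returns the highest concurrency that was actually observed.
    public static int Run(int[] ids, int maxParallel, Action<int> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int running = 0, peak = 0;

        Parallel.ForEach(ids, options, id =>
        {
            int now = Interlocked.Increment(ref running);

            // Record the peak concurrency with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

            work(id); // placeholder for the per-ID work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

Capping the degree of parallelism also caps how many contexts (and therefore connections) can be open at once, which may matter if each iteration opens its own context.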
The underlying database connections that the Entity Framework is using are not thread-safe. You will need to create a new context for each operation on another thread that you're going to perform.
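If you want to stay with Parallel.ForEach, one way to avoid sharing a context is the overload that takes localInit and localFinally delegates, so each worker task creates and disposes its own context. This is a rough sketch rather than a drop-in fix: FakeContext and RunProcedures are hypothetical stand-ins for a real context (such as XmlEntities) and its stored procedure calls.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an EF context; it only counts how many were created.
public sealed class FakeContext : IDisposable
{
    public static int Created;
    public FakeContext() => Interlocked.Increment(ref Created);
    public void RunProcedures(int id) { /* call the stored procedures here */ }
    public void Dispose() { }
}

public static class PerWorkerContextDemo
{
    // Processes every id and returns how many were processed.
    public static int Run(IEnumerable<int> ids)
    {
        int processed = 0;

        Parallel.ForEach<int, FakeContext>(
            ids,
            // localInit: runs once per worker task, never once per item.
            () => new FakeContext(),
            // body: uses only the context that belongs to this worker task.
            (id, loopState, ctx) =>
            {
                ctx.RunProcedures(id);
                Interlocked.Increment(ref processed);
                return ctx;
            },
            // localFinally: dispose the worker's context when it is done.
            ctx => ctx.Dispose());

        return processed;
    }
}
```

With this shape, the number of contexts tracks the number of worker tasks the loop decides to use, not the number of IDs or procedures.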
Your concern about how to parallelize the operation is a valid one; that many contexts are going to be expensive to open and close.
Instead, you might want to invert how you're thinking about parallelizing the code. It seems you're looping over a number of items and then calling the stored procedures in serial for each item.
If you can, create a new Task<TResult> (or Task, if you don't need a result) for each procedure and then, in that Task<TResult>, open a single context, loop through all of the items, and execute the stored procedure. This way, you only have a number of contexts equal to the number of stored procedures that you are running in parallel.
Let's assume you have a MyDbContext with two stored procedures, DoSomething1 and DoSomething2, both of which take an instance of a class, MyItem.
Implementing the above would look something like:
    // You'd probably want to materialize this into an IList<T> to avoid
    // warnings about multiple iterations of an IEnumerable<T>.
    // You definitely *don't* want this to be an IQueryable<T>
    // returned from a context.
    IEnumerable<MyItem> items = ...;

    // The first stored procedure is called here.
    Task t1 = Task.Run(() =>
    {
        // Create the context.
        using (var ctx = new MyDbContext())
        // Cycle through each item.
        foreach (MyItem item in items)
        {
            // Call the first stored procedure.
            // You'd, of course, have to do something with item here.
            ctx.DoSomething1(item);
        }
    });

    // The second stored procedure is called here.
    Task t2 = Task.Run(() =>
    {
        // Create the context.
        using (var ctx = new MyDbContext())
        // Cycle through each item.
        foreach (MyItem item in items)
        {
            // Call the second stored procedure.
            // You'd, of course, have to do something with item here.
            ctx.DoSomething2(item);
        }
    });

    // Do something when both of the tasks are done.
If you can't execute the stored procedures in parallel (because each one depends on being run in a certain order), you can still parallelize your operations; it's just a little more complex.
You would look at creating custom partitions across your items (using the static Create method on the Partitioner class). This will give you the means to get IEnumerator<T> implementations (note, this is not IEnumerable<T>, so you can't foreach over it).
For each IEnumerator<T> instance you get back, you'd create a new Task<TResult> (if you need a result), and in the Task<TResult> body, you would create the context and then cycle through the items returned by the IEnumerator<T>, calling the stored procedures in order.
That would look like this:
    // Get the partitioner.
    OrderablePartitioner<MyItem> partitioner = Partitioner.Create(items);

    // Get the partitions.
    // You'll have to set the parameter for the number of partitions here.
    // See the link for creating custom partitions for more
    // creation strategies.
    IList<IEnumerator<MyItem>> partitions = partitioner.GetPartitions(
        Environment.ProcessorCount);

    // Create a task for each partition.
    Task[] tasks = partitions.Select(p => Task.Run(() =>
    {
        // Create the context.
        using (var ctx = new MyDbContext())
        // Remember, the IEnumerator<T> implementation
        // might implement IDisposable.
        using (p)
        // While there are items in p.
        while (p.MoveNext())
        {
            // Get the current item.
            MyItem current = p.Current;

            // Call the stored procedures, in order, for the current item.
            ctx.DoSomething1(current);
            ctx.DoSomething2(current);
        }
    })).
    // ToArray is needed (or something to materialize the list) to
    // avoid deferred execution.
    ToArray();
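To see the partitioning on its own, here is a small self-contained sketch that leaves EF out entirely. PartitionDemo, Run, and the process delegate are hypothetical names; in real code each task would also create its own context, as described above. It uses the Partitioner.Create overload that takes an IList<T> and a loadBalance flag.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PartitionDemo
{
    // Splits `items` into `partitionCount` partitions, processes each partition
    // on its own task, and returns every processed item in sorted order.
    public static List<int> Run(IList<int> items, int partitionCount, Action<int> process)
    {
        var processed = new ConcurrentBag<int>();

        IList<IEnumerator<int>> partitions =
            Partitioner.Create(items, loadBalance: true).GetPartitions(partitionCount);

        Task[] tasks = partitions.Select(p => Task.Run(() =>
        {
            // A real implementation would open one context here, per task.
            using (p)
            {
                while (p.MoveNext())
                {
                    process(p.Current);
                    processed.Add(p.Current);
                }
            }
        })).ToArray(); // Materialize so that every task is actually started.

        // Block until all partitions have been drained.
        Task.WaitAll(tasks);
        return processed.OrderBy(x => x).ToList();
    }
}
```

Each item is handed to exactly one partition, so the stored procedures for a given item still run in order on a single thread.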
EF contexts are not thread-safe, so you cannot share one across the iterations of a Parallel loop.
Take a look at Entity Framework and Multi threading and this article.