I am using the latest version of Entity Framework in my application (I don't think EF is the issue here, I'm just stating which ORM we use), which has a multi-tenant architecture. I was running some stress tests, written in C#, that create X tasks which run in parallel to do some work. Near the beginning of the process, each task creates a new database (one per tenant in this case) and then continues with the bulk of the operation. But on some tasks, one of 2 SQL exceptions is thrown at the exact point in my code where it tries to create the new database.
Exception #1:
Could not obtain exclusive lock on database 'model'. Retry the operation later. CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
Exception #2:
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
It's always one of those two, and both are thrown on the same line of my code (where EF creates the database). Apparently SQL Server creates databases one at a time and locks the 'model' database while doing so (see here), so some of the waiting tasks either time out or get the lock error on 'model'.
Those tests were run on our development SQL Server 2014 instance (12.0.4213). If I execute, say, 100 parallel tasks, there is bound to be an error on some tasks, and sometimes on nearly half of the tasks I executed.
BUT here's the most disturbing part of all this: when I test it on my other SQL Server instance (12.0.2000), which is installed locally on my PC, no such error is thrown and all the tasks I executed finish successfully (even 1000 tasks in parallel!).
Solutions I've tried so far but didn't work:
Anyway, here is a simple C# console application you can download to try to replicate the issue. The test app runs the number of tasks you enter; each task simply creates a database and cleans up right afterwards.
2 observations:
Since the underlying issue has something to do with concurrency, and access to a "resource" which at a key point only allows a single, but not a concurrent, accessor, it's unsurprising that you might be getting differing results on two different machines when executing highly concurrent scenarios under load. Further, SQL Server Engine differences might be involved. All of this is just par for the course for trying to figure out and debug concurrency issues, especially with an engine involved that has its own very strong notions of concurrency.
Rather than going against the grain of the situation by trying to make something work or fully explain a situation, when things are empirically not working, why not change approach by designing for cleaner handling of the problem?
One option: acknowledge the reality of SQL Server's need to have an exclusive lock on the model db by regulating access via some kind of concurrency synchronization mechanism. A System.Threading.Monitor sounds about right for what is happening here, and it would allow you to control what happens when there is a timeout, with a timeout of your choosing. This will help prevent the kind of lock-up scenario that may be happening on the SQL Server end, which would be an explanation for the current "timeouts" symptom (although stress load might be the sole explanation).
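A minimal sketch of the Monitor idea, assuming all the creating tasks live in the same process. The names DatabaseCreationGate, TryCreate and createDatabase are made up for illustration; createDatabase stands for whatever call in your code actually makes EF create the database.

```csharp
using System;
using System.Threading;

public static class DatabaseCreationGate
{
    // One process-wide lock object, so only one create runs at a time.
    private static readonly object CreateLock = new object();

    // Returns false if the lock could not be acquired within waitTimeout;
    // the caller then decides whether to retry, back off, or fail the task.
    // createDatabase is a placeholder for your own database-creation call.
    public static bool TryCreate(Action createDatabase, TimeSpan waitTimeout)
    {
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(CreateLock, waitTimeout, ref lockTaken);
            if (!lockTaken)
            {
                return false; // timed out waiting for another creator
            }

            createDatabase();
            return true;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(CreateLock);
            }
        }
    }
}
```

Note that this only serializes creates issued from one process; it does nothing about other clients creating databases on the same server. Monitor is also tied to the acquiring thread, so the delegate has to be synchronous rather than something that awaits while holding the lock.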
Another option: see if you can design this in such a way that you don't need to synchronize at all. Get to a point where you never request more than one database create simultaneously. For example, use some kind of queue of create requests that is guaranteed to be serviced by only one thread, with the requesting tasks using async/await on the result of each create.
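The queue idea could look roughly like the following. This is only a sketch under assumptions: DatabaseCreateQueue and EnqueueAsync are hypothetical names, the delegate stands in for your real database-creation call, and BlockingCollection plus TaskCompletionSource is just one of several ways to build it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class DatabaseCreateQueue : IDisposable
{
    private sealed class Request
    {
        public Action Create;
        public TaskCompletionSource<bool> Completion;
    }

    private readonly BlockingCollection<Request> _requests =
        new BlockingCollection<Request>();
    private readonly Task _worker;

    public DatabaseCreateQueue()
    {
        // A single long-running worker services the queue, so at most
        // one create is ever in flight from this process.
        _worker = Task.Factory.StartNew(
            ProcessRequests,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }

    // Callers await the returned task; it completes when their create
    // has run, or faults with the exception the create threw.
    public Task EnqueueAsync(Action createDatabase)
    {
        var request = new Request
        {
            Create = createDatabase,
            // Keep awaiting callers' continuations off the worker thread.
            Completion = new TaskCompletionSource<bool>(
                TaskCreationOptions.RunContinuationsAsynchronously)
        };
        _requests.Add(request);
        return request.Completion.Task;
    }

    private void ProcessRequests()
    {
        foreach (var request in _requests.GetConsumingEnumerable())
        {
            try
            {
                request.Create();
                request.Completion.SetResult(true);
            }
            catch (Exception ex)
            {
                request.Completion.SetException(ex);
            }
        }
    }

    public void Dispose()
    {
        _requests.CompleteAdding(); // let the worker drain and exit
        _worker.Wait();
        _requests.Dispose();
    }
}
```

A task would then do something like `await queue.EnqueueAsync(() => CreateTenantDatabase(tenantId));`, where CreateTenantDatabase is again a placeholder for your own code. Because failures surface on the awaited task, each caller can still apply its own retry or timeout policy.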
Either way, you are going to have situations where this slows down to a crawl under stress testing, with super stressed loads causing failure. The key questions are: