Looking through my logs, I can see that my app is vulnerable to deadlocks; they are occurring in many parts of the application.
1) Is there a way to replicate this issue? I.e. I have only seen it in the logs.
2) What is the best/simplest way to retry if the transaction is deadlocked?
3) If I wrapped the call in a try/catch, what would the exception type be?
There is a lot written about the issue. I have concluded that the best option is to shorten the transactions as much as possible. Should I also change the isolation levels?
Deadlocks are very hard to reproduce. If you know why they occur, you may be able to reproduce them in integration tests. In real environments you can use SQL Server Profiler to observe deadlocks; it shows a graph of how each deadlock was formed.
You should actually throw away the transaction and start again; the NHibernate session is out of sync after any database exception.
We add a delay before restarting to avoid putting more stress on the database. The wait time includes a random component so that the parallel transactions do not synchronize and collide again.
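Roughly, such a retry wrapper could look like the sketch below. This is only an illustration of the idea, not our actual code: the RetryingUnitOfWork class, the attempt limit, and the delay values are placeholders, and the session factory is assumed to be configured elsewhere.
using System;
using System.Data.SqlClient;
using System.Threading;
using NHibernate;

public class RetryingUnitOfWork
{
    private static readonly Random Rng = new Random();
    private readonly ISessionFactory _sessionFactory;   // assumed to be built elsewhere

    public RetryingUnitOfWork(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public void Execute(Action<ISession> work, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // Always start over with a fresh session and transaction;
                // the old session is out of sync after a database exception.
                using (var session = _sessionFactory.OpenSession())
                using (var tx = session.BeginTransaction())
                {
                    work(session);
                    tx.Commit();
                    return;
                }
            }
            catch (Exception ex) when (IsDeadlock(ex) && attempt < maxAttempts)
            {
                // Randomized wait so the competing transactions do not
                // synchronize and run into the same deadlock again.
                Thread.Sleep(200 * attempt + Rng.Next(500));
            }
        }
    }

    private static bool IsDeadlock(Exception ex)
    {
        // NHibernate usually wraps the SqlException, so walk the inner exceptions.
        for (var e = ex; e != null; e = e.InnerException)
        {
            var sqlEx = e as SqlException;
            if (sqlEx != null && sqlEx.Number == 1205)
                return true;
        }
        return false;
    }
}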
Reducing Lock Time
If you are using SQL Server, it is quite vulnerable to deadlocks because of its pessimistic locking mechanism (in contrast to Oracle). The newer Snapshot isolation level is similar to what Oracle does and may fix the problem to some degree, but I have never used it, so I can't say much about it.
NHibernate mitigates the problem as far as possible by caching changes to persistent data and writing them to the database only at the end of the transaction. But there are limits, and there are ways to break this.
Using identity ("auto numbers") as primary keys is probably the most famous mistake. It forces NH to insert entities as soon as they are added to the session, which takes a lock on the whole table (in SQL Server).
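For example, with NHibernate's mapping-by-code you can switch to an id generator that NH can assign in memory, so the insert can wait until flush time. A sketch only; the Order entity and the guid.comb choice are illustrative, a hilo generator works the same way.
using System;
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

// Illustrative entity; any root entity of yours works the same way.
public class Order
{
    public virtual Guid Id { get; protected set; }
    public virtual string Number { get; set; }
}

public class OrderMap : ClassMapping<Order>
{
    public OrderMap()
    {
        // guid.comb (or hilo) lets NHibernate assign the id in memory,
        // so the INSERT can be deferred until the transaction is flushed,
        // unlike identity, which forces an immediate INSERT.
        Id(x => x.Id, m => m.Generator(Generators.GuidComb));
        Property(x => x.Number);
    }
}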
More complicated to fix is the flushing problem. NH needs to flush changes before executing queries to ensure consistency. You can get around this by setting FlushMode to Never, which may cause consistency problems, so only do it when you know exactly what you are doing. The best solution is to only use Get or Load, or to navigate to properties of a root entity, instead of performing queries in the middle of a transaction.
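A small sketch of the difference (the Customer entity and the method are just an illustration, not real API beyond ISession.Get and FlushMode):
using System;
using NHibernate;

// Illustrative entity for the example below.
public class Customer
{
    public virtual Guid Id { get; protected set; }
    public virtual string Name { get; set; }
}

public static class FlushExample
{
    public static void Run(ISession session, Guid customerId)
    {
        // Get/Load by id (or navigating from an already loaded root entity)
        // does not force NHibernate to flush pending changes.
        var customer = session.Get<Customer>(customerId);
        customer.Name = "New name";

        // In contrast, running a query here (e.g. session.Query<Customer>()...)
        // would make NH flush the pending UPDATE first, taking locks earlier.

        // Changing the flush mode is the riskier alternative mentioned above:
        // session.FlushMode = FlushMode.Never;  // replaced by FlushMode.Manual in newer NH versions
    }
}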
By doing all this, NH is able to defer every Insert, Update, and Delete command until the end of the transaction. This reduces lock time a lot and therefore also reduces the risk of deadlocks.
General Rules To Avoid Deadlock
The general rules for avoiding deadlocks also apply when using NHibernate. Most important: lock resources in a consistent order, and lock resources all at once at the beginning rather than one by one. The latter contradicts what I said above about reducing lock time: it means locking resources at the beginning of a transaction so that other transactions wait until it is finished. This may reduce deadlocks, but it also reduces parallel execution.
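As a minimal sketch of consistent lock ordering: the Account entity and the transfer scenario below are hypothetical, but the idea is that every transaction acquires its row locks in ascending id order, so two competing transfers can never grab the same rows in opposite orders.
using System;
using System.Linq;
using NHibernate;

// Illustrative entity for the lock-ordering sketch.
public class Account
{
    public virtual int Id { get; protected set; }
    public virtual decimal Balance { get; set; }
}

public static class LockOrdering
{
    public static void Transfer(ISession session, int fromId, int toId, decimal amount)
    {
        // Lock both rows up front, always in ascending id order.
        foreach (var id in new[] { fromId, toId }.OrderBy(x => x))
        {
            var account = session.Get<Account>(id, LockMode.Upgrade);
            account.Balance += (id == fromId) ? -amount : amount;
        }
    }
}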
This is the solution we opted to use in a legacy system where we could not fix the root cause of these deadlocks, as it would have meant rewriting a lot of existing and poorly documented code. The system was using DataSets and ADO.NET classes, so if you intend to use NHibernate I fear you would have to research its internals and/or develop your own extension if the existing functionality does not cover this.
1) If the code is prone to deadlocks, they should start appearing under sufficient database load: you need many simultaneous connections working with the same tables through the problematic procedures.
It is difficult to reproduce deadlocks in exactly the places you want, but if you just need deadlocks in general to test your retry procedure, do simultaneous reads/inserts into the same tables from 10+ threads with differing access order (e.g. table A then table B in some threads, table B then table A in others) with small delays, and you will get one soon. A rough harness along those lines is sketched below.
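The sketch below is only illustrative: the table names (TableA/TableB), the connection string, and the thread and iteration counts are placeholders you would adapt to your test database.
using System;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

public static class DeadlockGenerator
{
    private const string ConnectionString = "...";       // your test database

    public static void Run()
    {
        var tasks = new Task[10];
        for (int i = 0; i < tasks.Length; i++)
        {
            bool aFirst = i % 2 == 0;                     // half the threads lock A then B,
            tasks[i] = Task.Run(() => Hammer(aFirst));    // the other half B then A
        }
        Task.WaitAll(tasks);
    }

    private static void Hammer(bool aFirst)
    {
        for (int i = 0; i < 100; i++)
        {
            try
            {
                using (var cn = new SqlConnection(ConnectionString))
                {
                    cn.Open();
                    using (var tx = cn.BeginTransaction())
                    {
                        Touch(cn, tx, aFirst ? "TableA" : "TableB");
                        Thread.Sleep(50);                 // small delay widens the deadlock window
                        Touch(cn, tx, aFirst ? "TableB" : "TableA");
                        tx.Commit();
                    }
                }
            }
            catch (SqlException ex)
            {
                Console.WriteLine(ex.Number == 1205 ? "deadlock provoked" : ex.Message);
            }
        }
    }

    private static void Touch(SqlConnection cn, SqlTransaction tx, string table)
    {
        using (var cmd = cn.CreateCommand())
        {
            cmd.Transaction = tx;
            cmd.CommandText = "UPDATE " + table + " SET Value = Value + 1 WHERE Id = 1";
            cmd.ExecuteNonQuery();
        }
    }
}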
2) You need to retry the entire code fragment that works with the transaction, and retry the data initialization too. That is, if you are filling DataSets within the transaction, you have to clear them at the beginning of the retryable code block, as sketched below.
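A sketch of that structure only: FillAndSaveWithinTransaction and IsRetryable are placeholders for your own transactional code, and the retryable error numbers are covered in point 3).
using System;
using System.Data;
using System.Data.SqlClient;

public static class RetryableBlock
{
    // Placeholder for the real work: open connection, begin transaction,
    // fill the DataSet, save, commit.
    private static void FillAndSaveWithinTransaction(DataSet data) { /* ... */ }

    // Placeholder; see the error-number switch in point 3).
    private static bool IsRetryable(SqlException ex) { return ex.Number == 1205; }

    public static void Run()
    {
        var data = new DataSet();
        bool canRetry;
        int attempts = 0;
        do
        {
            canRetry = false;
            attempts++;
            data.Clear();   // reset anything filled by a failed attempt
            try
            {
                FillAndSaveWithinTransaction(data);
            }
            catch (SqlException ex)
            {
                canRetry = attempts < 3 && IsRetryable(ex);
                if (!canRetry) throw;
            }
        } while (canRetry);
    }
}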
3) It is a SqlException with Number = 1205. In general, you can also retry on timeouts and network errors:
switch (sqlEx.Number)
{
    case 1205:           // transaction was chosen as the deadlock victim
    {
        DebugLog("DEADLOCK!");
        canRetry = true;
        break;
    }
    case -2:             // client-side command timeout
    case -2147217871:    // timeout reported as an OLE DB HRESULT
    {
        DebugLog("TIMEOUT!");
        canRetry = true;
        break;
    }
    case 11:             // general network error
    {
        DebugLog("NETWORK ERROR!");
        canRetry = true;
        break;
    }
    default:
    {
        DebugLog(string.Format("SQL ERROR: {0}", sqlEx.Number));
        break;
    }
}
In my experience, when retrying after a deadlock it is best to discard the connection from the pool with SqlConnection.ClearPool(connection), because it might not be reset properly for the next use.