Introduction
SQL Server deadlocks are not rare. Most complex systems will have some sort of deadlock. It's practically impossible to avoid deadlocks, and that's why we have deadlock detection built in to SQL Server. SQL Server resolves deadlocks by terminating one of the transactions involved, and the usual victim is the less resource intensive one. Too many deadlocks adversely affect application performance and scalability.
What's a Deadlock
Deadlock is a special kind of blocking scenario where two or more threads are blocked against each other where none can proceed. Consider two transactions T1 and T2 using resources R1 and R2. T1 holds an exclusive lock to R1, and T2 has an exclusive lock on R2. At any point of time in the execution, if T1 requires an exclusive lock to R2 and T2 to R1, both T1 and T2 will be deadlocked. There are a lot of articles on the net explaining deadlocks and how to handle them. Here, I will try to explain how a real time deadlock is tracked down and its solution; of course, the solution is simple once we know the cause.
Problem Definition
The application in question is a simple Remoting server servicing about 300 concurrent users. The application is well written, and follows most of the best practices. Enabling SQL Server tracing revealed a large number of deadlocks and resulting transaction failures. To turn on flag 1204, use DBCC TRACEON (1222, -1). SQL Server 2005 introduces a new and improved version of the tried and tested T1024. If you are using SQL Server 2005, use the T1222 flag. Below is the deadlock information (truncated for clarity) printed by the T1024 flag:
// Any Deadlock encountered .... Printing deadlock information
2007-09-13 16:54:47.15 spid4
2007-09-13 16:54:47.15 spid4 Wait-for graph
2007-09-13 16:54:47.15 spid4
2007-09-13 16:54:47.15 spid4 Node:1
2007-09-13 16:54:47.15 spid4 KEY: 7:645577338:1 (12014f0bec4f)
CleanCnt:2 Mode: Range-S-S Flags: 0x0
2007-09-13 16:54:47.15 spid4 Grant List::
2007-09-13 16:54:47.15 spid4 Owner:0x1a3cc3c0 Mode: Range-S-S Flg:0x0
Ref:2 Life:02000000 SPID:56 ECID:0
2007-09-13 16:54:47.15 spid4 SPID: 56 ECID: 0 Statement
Type: INSERT Line #: 1
2007-09-13 16:54:47.15 spid4 Input Buf: Language
Event: Insert from t_cash_folder
(ADDED_DT ,
2007-09-13 16:54:47.15 spid4 Requested By:
2007-09-13 16:54:47.15 spid4 ResType:LockOwner Stype:'OR'
Mode: Range-Insert-Null SPID:51
ECID:0 Ec:(0x1a4a9570)
Value:0x25af8760 Cost:(0/D0)
2007-09-13 16:54:47.15 spid4
2007-09-13 16:54:47.15 spid4 Node:2
2007-09-13 16:54:47.15 spid4 KEY: 7:645577338:1 (12014f0bec4f)
CleanCnt:2 Mode: Range-S-S Flags: 0x0
2007-09-13 16:54:47.15 spid4 Grant List::
2007-09-13 16:54:47.15 spid4 Owner:0x1a3c4e80 Mode: Range-S-S
Flg:0x0 Ref:2 Life:02000000
SPID:51 ECID:0
2007-09-13 16:54:47.15 spid4 SPID: 51 ECID: 0 Statement
Type: INSERT Line #: 1
2007-09-13 16:54:47.15 spid4 Input Buf: Language
Event: Insert into t_cash_folder (...
2007-09-13 16:54:47.15 spid4 Requested By:
2007-09-13 16:54:47.15 spid4 ResType:LockOwner Stype:'OR'
Mode: Range-Insert-Null SPID:56 ECID:0
Ec:(0x1e4f5570)
Value:0x1a3c56e0 Cost:(0/D0)
2007-09-13 16:54:47.15 spid4 Victim Resource Owner:
2007-09-13 16:54:47.15 spid4 ResType:LockOwner Stype:'OR'
Mode: Range-Insert-Null SPID:56 ECID:0
Ec:(0x1e4f5570)
Value:0x1a3c56e0 Cost:(0/D0)
2007-09-13 16:54:52.15 spid4
//
Analyzing the T1024 Output
I will try to make the analysis as simple as possible. Each node in the deadlock information represents the resources involved. Under each node, we have three sections: KEY, GrantList, and Requested By.
KEY: 7:645577338:1 (12014f0bec4f) CleanCnt:2 Mode: Range-S-S Flags: 0x0
KEY is followed by the ID of the resource as it appears in system tables (systables and sysindexes). The cryptic resourceID KEY have three parts: {KEY : DatabaseID : TableID: IndexType }, where IndexType is 1 for a clustered index, and greater than 1 for non-clustered. 7:645577338:1 (12014f0bec4f)
represents a clustured index (hash=12014f0bec4f) on the table with id 645577338 in Database 7.
Another significant point to be noted here is the lock mode; the Mode attribute on KEY is the highest lock mode granted for the resource. Other attributes of KEY can be ignored.
GrantList
GrantList, as the name suggests, list all the processes which have been granted some kind of lock on the resource in question. It lists the SPID which is holding the lock, lock mode, and the SQL being executed.
Owner:0x1a3cc3c0 Mode: Range-S-S Flg:0x0 Ref:2 Life:02000000 SPID:56 ECID:0 spid4
SPID: 56 ECID: 0 Statement Type: INSERT Line #: 1 spid4 Input Buf: Language
Event: Insert into t_cash_folder (ADDED_DT
SPID 56 is holding the Range-S-S lock on the resource identified by hash 12014f0bec4f, and is trying to execute an INSERT
. Now, the question is why the INSERT
statement is holding a Range-S-S lock since we know that Range-S-S is a serialized range scan which is usually triggered by a SELECT
in serialized isolation mode. Also remember that the INSERT
is not the current statement being executed, but the one which initially requested for the lock. So it follows that there's a SELECT
statement executed in serialized isolation context which requires a RangeS-S lock.
Requested By
This section lists the SPID waiting to acquire a lock on the resource. SPID 51 is waiting to acquire a Range-Insert-Null lock on the same resource. Range-Insert-Null locks are required to test for insertability before the actual insert is made.
Requested By: 2007-09-13 16:54:47.15 spid4 ResType:LockOwner Stype:'OR'
Mode: Range-Insert-Null SPID:51 ECID:0 Ec :(0x1a4a9570) Value:0x25af8760 Cost:(0/D0)
Now we know what's happening. We have SPID 56 holding a Range-S-S lock, and is waiting for a Range-Insert-Null on the same resource, while SPID 51 too holds a Range-S-S lock (Range S-S lock is compatible with itself) and waiting for a Range-Insert-Null. Now the sequence which led to the deadlock:
- SPID 56 acquired a Range-S-S lock on a clustered index B+ tree branch.
- SPID 51 acquired a Range-S-S lock on the same clustered index B+ tree branch.
- SPID 56 requires a Range-Insert-Null lock on the branch, and is waiting for SPID 51 to release its RangeS-S lock.
- SPID 51 requires a Range-Insert-Null lock on the branch, and is waiting for SPID 56 to release its RangeS-S lock.
Solution
The Range-S-S lock held by both the SPIDs suggest that the application is using serialized isolation mode as this locking mode is used when SELECT
is used within serialized isolation mode. serialized isolation is the highest possible isolation mode, and eliminates phantom reads and guarantees repeatable reads. A repeatable read guarantee requires the RangeS-S lock to be held for the scope of the transaction.
You can try changing the isolation mode if serialized isolation mode is not really what the application requires. If you prefer to use System.Transaction
instead of the ADO.NET transaction, remember that the former uses serialized isolation mode as default. Another idea is to think of the possibility of moving the Select
statement outside the transaction. This will eliminate the need for the RangeS-S lock and, of course, the deadlock.
Conclusion
SQL Server is designed to handle deadlocks, and it has a dedicated agent which wakes up now and then to look for deadlocks and resolve them. Your code should be written to handle possible deadlocks and resubmit the same transaction if required. Apart from handling deadlocks in code, simple code rearrangement and consistent database access across applications can avoid many deadlocks. Even though you can't avoid deadlocks altogether, they can be kept to a minimum by following some of the best practices. Remember that deadlocks are scalability killers of your application.