Introduction
The objective of this article is to demonstrate a common pitfall associated with accessing shared data, without enough attention given to thread-safety.
Background
Thread safety is defined by Wikipedia (and many scholars) as:
Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads. In particular, it must satisfy the need for multiple threads to access the same shared data, and the need for a shared piece of data to be accessed by only one thread at any given time.
Quite often, this subject is overlooked when an application or service is being designed for multi-threaded, multi-source access. In this article, we will analyze a scenario where we need to store a shared incremental key in a database. We will look at a couple of non-solutions, then I will demonstrate a new feature available in SQL 2008 and above, which substantially simplifies the procedure.
Setting Up a Test Table
Our shared data will allow for incremental keys to be generated for an application and its subsystems, identified by a couple of strings. Let's start by creating a table which will support such scenario.
CREATE TABLE [dbo].[UniqueKeys](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[App] [nvarchar](100) NOT NULL,
[System] [nvarchar](100) NOT NULL,
[Key] [bigint] NOT NULL
) ON [PRIMARY]
The Id
column is there only for good measure, so we can keep track of things as they are changing. An application will be identified by a string
up to 100 characters long, and so will be its subsystem. The final solution must:
- Create a new record for App/System pair with Key value of 1 if it does not exist
- Increment the value of a Key in a record identified by an App/System pair if it does exist
- Guarantee thread safety for multiple internal and external threads
Helper Methods
Let's create two methods, which will help us with getting data and executing update
statements.
#region Constants
private const string CApp = "GetUniqueKeyBadCode";
private const string CSystem = "Shared";
#endregion
#region Static Fields
private static readonly SqlConnectionStringBuilder Scsb = new SqlConnectionStringBuilder
{
DataSource = @"localhost\SQLEXPRESS",
InitialCatalog = "Claims",
IntegratedSecurity = true
};
#endregion
#region Public Methods and Operators
public static void ExecuteSql(string sql)
{
using (var sqlConn = new SqlConnection(Scsb.ConnectionString))
{
using (var sqlCommand = new SqlCommand(sql, sqlConn))
{
sqlConn.Open();
sqlCommand.ExecuteNonQuery();
}
}
}
public static IEnumerable<IDataRecord> GetData(string sql)
{
using (var sqlConn = new SqlConnection(Scsb.ConnectionString))
{
using (var sqlCommand = new SqlCommand(sql, sqlConn))
{
sqlConn.Open();
SqlDataReader reader = sqlCommand.ExecuteReader();
foreach (IDataRecord dataRecord in reader)
{
yield return dataRecord;
}
}
}
}
#endregion
I always prefer using the SqlConnectionStringBuilder
, as I could never memorize the proper naming convention of each parameter. You will have to update the InitialCatalog
property to point it at your database.
Next, we create a simple helper method, ExecuteSql
, which will execute the passed in SQL statement, without expecting any value to be returned. In a production scenario, I'd extend it by requiring all queries to be parameterized, as using dynamic SQL exposes your application to SQL injection attacks. The GetData
helper method returns an IEnumerable
of IDataRecord
, and leaves it up to the calling method to correctly retrieve and process each record and column returned. Again, in a production scenario, this would be parameterized.
You might want to check a tip I have published on this very subject ...
Non-Solution with C#
Our first attempt will be to handle the Key increments in a shared C# method, GetUniqueKey
.
private static long GetUniqueKey()
{
List<IDataRecord> current = GetData(string.Format("SELECT [Key] _
FROM dbo.UniqueKeys WHERE [App]='{0}' And [System]='{1}'", CApp, CSystem)).ToList();
if (current.Any())
{
long currentKey = current.First().GetInt64(0) + 1;
ExecuteSql(string.Format("UPDATE dbo.UniqueKeys SET [Key]={0} _
WHERE [App]='{1}' AND [System]='{2}'", currentKey, CApp, CSystem));
return currentKey;
}
ExecuteSql(string.Format("INSERT INTO [dbo].[UniqueKeys] _
([App],[System],[Key])VALUES('{0}','{1}',1)", CApp, CSystem));
return 1;
}
On the surface, this method looks perfectly fine, and seems to achieve the desired result. Upon first execution, it creates a new record:
Id App System Key
----- ------------------------ ------------ --------------------
1 GetUniqueKeyBadCode Shared 1
On a second execution, it will return 2, and so on, as long as the method is executed sequentially. However, things change dramatically, when you execute the same method from multiple threads. The simplest way to test it is using the Task Parallel Library.
Parallel.For(0, 100, i => GetUniqueKeyBadCode());
Depending on your computer architecture, you will get different results, but in my case, a quad core laptop, I got four identical records:
Id App System Key
-------------------- -------------------- ---------- --------------------
1 GetUniqueKeyBadCode Shared 31
2 GetUniqueKeyBadCode Shared 31
3 GetUniqueKeyBadCode Shared 31
4 GetUniqueKeyBadCode Shared 31
The reason for this result is that when my Parallel loop started, the first four requests, which went to four separate cores, did not find a pre-existing record, thus proceeded to create a new record on four different threads. All subsequent calls were just updating all four, based on App/System pair equality. Also, because all subsequent calls were occurring on four separate threads, the end Key value is 31, because multiple threads were given the same pre-existing value.
Fix for Single Platform App
If your application is single platform, meaning there is only one source of updates to the database, as in one service or one desktop app, the fix for this issue is fairly simple, by enclosing the content of the method in a lock block with shared static readonly object
:
private static readonly object Locker = new object();
private static long GetUniqueKeySinglePlatform()
{
lock (Locker)
{
List<IDataRecord> current = GetData(string.Format("SELECT [Key] _
FROM dbo.UniqueKeys WHERE [App]='{0}' And [System]='{1}'", CApp, CSystem)).ToList();
if (current.Any())
{
long currentKey = current.First().GetInt64(0) + 1;
ExecuteSql(string.Format("UPDATE dbo.UniqueKeys SET [Key]={0} _
WHERE [App]='{1}' AND [System]='{2}'", currentKey, CApp, CSystem));
return currentKey;
}
ExecuteSql(string.Format("INSERT INTO [dbo].[UniqueKeys] _
([App],[System],[Key])VALUES('{0}','{1}',1)", CApp, CSystem));
return 1;
}
}
After executing it in parallel like before, the result is:
Id App System Key
-------------------- -------------------- ---------- --------------------
1 GetUniqueKeyBadCode Shared 100
However, this still does not cover issues associated with running this method on multiple servers at the same time, for example, in a load balanced web service. For that, we must move our unique key generation to the database layer.
Using SQL Based Approach
You might be tempted to replicate the C# logic in SQL. After all, what would be simpler than creating a transaction, and executing few conditional SQL statements. My advise ... don't. You will quickly find out that simply relying on SQL transactions alone, will not guarantee thread safety. Instead, you must leverage a new SQL statement, which allows to execute either INSERT
or UPDATE
, in a single step. The statement is MERGE
, which per Microsoft's documentation:
Performs insert, update, or delete operations on a target table based on the results of a join with a source table. For example, you can synchronize two tables by inserting, updating, or deleting rows in one table based on differences found in the other table.
Here is a sample stored procedure utilizing MERGE
to either INSERT
a new record, or UPDATE
an existing record with a new value.
CREATE PROCEDURE [dbo].[GetUniqueKeyTV]
@app VARCHAR(100) ,
@sys VARCHAR(100) ,
@retVal BIGINT OUT
AS
BEGIN
DECLARE @T TABLE([Key] BIGINT);
SET NOCOUNT ON
MERGE INTO dbo.UniqueKeys WITH (TABLOCKX) AS Target
USING ( VALUES
( @app ,
@sys ,
1
) ) AS Source ( [App], [System], [Key] )
ON ( Target.App = Source.App
AND Target.[System] = Source.[System]
)
WHEN MATCHED
THEN UPDATE
SET Target.[Key] = Target.[Key] + 1
WHEN NOT MATCHED BY TARGET
THEN
INSERT (
[App] ,
[System] ,
[Key]
) VALUES
( Source.[App] ,
Source.[System] ,
Source.[Key]
) OUTPUT
inserted.[Key] INTO @T;
SELECT @retVal = [Key]
FROM @T
SET NOCOUNT OFF
END
GO
In the first part, we tell MERGE
which table will be updated, and we also advise the database engine to use table lock during any operation involving the target table. This ensures that no other records will be updated during execution of the MERGE
statement, at the expense of increased number of concurrent table locks, however, no deadlocks are to be expected, since there is no dependency on any prior step.
Then, we proceed with describing our intent for each scenario. If a record is found matching an App/System pair, we simply increment the Key's value by 1. If a record is not found, we proceed with inserting a new record to the database, giving it the default value of 1. Regardless of either scenario, we store the new value, either 1 or n+1, in a table variable, so we can return it in the retVal OUTPUT
parameter. There are other "OUTPUT
objects" available as a results of executing a MERGE
statement, and I strongly recommend you review its documentation at your own pace.
I have tested this approach with 200 parallel threads, 1000 requests each, using a simple utility written by Adam Mechanic, the SQLQueryStress
, available from here. The final unique key received: 200,000, is exactly what I expected.
Summary
Thread safety is an extremely important, and much debated issue, amongst many developers. Some expect it just to work, somehow magically, others will argue its meaning. To me personally, it simply means that the code I wrote can be safely executed from multiple threads and environments, while returning a logically consistent result.
History
- 6/16/2013 - Initial version