Thread-safety with SQL Server 2008

Darek Danielewski

4.86/5 (6 votes)

16 Jun 2013CPOL6 min read

34.8K

The subject of thread safety is often overlooked by even experienced developers. When designing an application or service to be utilized by multiple internal and external threads, one must account for potential issues arising from concurrent access to shared data.

Introduction

The objective of this article is to demonstrate a common pitfall associated with accessing shared data, without enough attention given to thread-safety.

Background

Thread safety is defined by Wikipedia (and many scholars) as:

Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads. In particular, it must satisfy the need for multiple threads to access the same shared data, and the need for a shared piece of data to be accessed by only one thread at any given time.

Quite often, this subject is overlooked when an application or service is being designed for multi-threaded, multi-source access. In this article, we will analyze a scenario where we need to store a shared incremental key in a database. We will look at a couple of non-solutions, then I will demonstrate a new feature available in SQL 2008 and above, which substantially simplifies the procedure.

Setting Up a Test Table

Our shared data will allow for incremental keys to be generated for an application and its subsystems, identified by a couple of strings. Let's start by creating a table which will support such scenario.

SQL

CREATE TABLE [dbo].[UniqueKeys](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [App] [nvarchar](100) NOT NULL,
    [System] [nvarchar](100) NOT NULL,
    [Key] [bigint] NOT NULL
) ON [PRIMARY]

The Id column is there only for good measure, so we can keep track of things as they are changing. An application will be identified by a string up to 100 characters long, and so will be its subsystem. The final solution must:

Create a new record for App/System pair with Key value of 1 if it does not exist
Increment the value of a Key in a record identified by an App/System pair if it does exist
Guarantee thread safety for multiple internal and external threads

Helper Methods

Let's create two methods, which will help us with getting data and executing update statements.

#region Constants

private const string CApp = "GetUniqueKeyBadCode";

private const string CSystem = "Shared";

#endregion

#region Static Fields

private static readonly SqlConnectionStringBuilder Scsb = new SqlConnectionStringBuilder
                                                              {
                                                                  DataSource = @"localhost\SQLEXPRESS",
                                                                  InitialCatalog = "Claims",
                                                                  IntegratedSecurity = true
                                                              };

#endregion

#region Public Methods and Operators

public static void ExecuteSql(string sql)
{
    using (var sqlConn = new SqlConnection(Scsb.ConnectionString))
    {
        using (var sqlCommand = new SqlCommand(sql, sqlConn))
        {
            sqlConn.Open();
            sqlCommand.ExecuteNonQuery();
        }
    }
}

public static IEnumerable<IDataRecord> GetData(string sql)
{
    using (var sqlConn = new SqlConnection(Scsb.ConnectionString))
    {
        using (var sqlCommand = new SqlCommand(sql, sqlConn))
        {
            sqlConn.Open();
            SqlDataReader reader = sqlCommand.ExecuteReader();
            foreach (IDataRecord dataRecord in reader)
            {
                yield return dataRecord;
            }
        }
    }
}

#endregion

I always prefer using the SqlConnectionStringBuilder, as I could never memorize the proper naming convention of each parameter. You will have to update the InitialCatalog property to point it at your database.

Next, we create a simple helper method, ExecuteSql, which will execute the passed in SQL statement, without expecting any value to be returned. In a production scenario, I'd extend it by requiring all queries to be parameterized, as using dynamic SQL exposes your application to SQL injection attacks. The GetData helper method returns an IEnumerable of IDataRecord, and leaves it up to the calling method to correctly retrieve and process each record and column returned. Again, in a production scenario, this would be parameterized.

You might want to check a tip I have published on this very subject ...

Non-Solution with C#

Our first attempt will be to handle the Key increments in a shared C# method, GetUniqueKey.

private static long GetUniqueKey()
{
    List<IDataRecord> current = GetData(string.Format("SELECT [Key] _
    FROM dbo.UniqueKeys WHERE [App]='{0}' And [System]='{1}'", CApp, CSystem)).ToList();
    if (current.Any())
    {
        //Record found
        long currentKey = current.First().GetInt64(0) + 1;
        ExecuteSql(string.Format("UPDATE dbo.UniqueKeys SET [Key]={0} _
        WHERE [App]='{1}' AND [System]='{2}'", currentKey, CApp, CSystem));
        return currentKey;
    }
    //Record not found
    ExecuteSql(string.Format("INSERT INTO [dbo].[UniqueKeys] _
    ([App],[System],[Key])VALUES('{0}','{1}',1)", CApp, CSystem));
    return 1;
}

On the surface, this method looks perfectly fine, and seems to achieve the desired result. Upon first execution, it creates a new record:

Id    App                      System       Key
----- ------------------------ ------------ --------------------
1     GetUniqueKeyBadCode      Shared       1

On a second execution, it will return 2, and so on, as long as the method is executed sequentially. However, things change dramatically, when you execute the same method from multiple threads. The simplest way to test it is using the Task Parallel Library.

Parallel.For(0, 100, i => GetUniqueKeyBadCode());

Depending on your computer architecture, you will get different results, but in my case, a quad core laptop, I got four identical records:

Id                   App                  System     Key
-------------------- -------------------- ---------- --------------------
1                    GetUniqueKeyBadCode  Shared     31
2                    GetUniqueKeyBadCode  Shared     31
3                    GetUniqueKeyBadCode  Shared     31
4                    GetUniqueKeyBadCode  Shared     31

The reason for this result is that when my Parallel loop started, the first four requests, which went to four separate cores, did not find a pre-existing record, thus proceeded to create a new record on four different threads. All subsequent calls were just updating all four, based on App/System pair equality. Also, because all subsequent calls were occurring on four separate threads, the end Key value is 31, because multiple threads were given the same pre-existing value.

Fix for Single Platform App

If your application is single platform, meaning there is only one source of updates to the database, as in one service or one desktop app, the fix for this issue is fairly simple, by enclosing the content of the method in a lock block with shared static readonly object:

private static readonly object Locker = new object();
private static long GetUniqueKeySinglePlatform()
{
    lock (Locker)
    {
        List<IDataRecord> current = GetData(string.Format("SELECT [Key] _
        FROM dbo.UniqueKeys WHERE [App]='{0}' And [System]='{1}'", CApp, CSystem)).ToList();
        if (current.Any())
        {
            //Record found
            long currentKey = current.First().GetInt64(0) + 1;
            ExecuteSql(string.Format("UPDATE dbo.UniqueKeys SET [Key]={0} _
            WHERE [App]='{1}' AND [System]='{2}'", currentKey, CApp, CSystem));
            return currentKey;
        }
        //Record not found
        ExecuteSql(string.Format("INSERT INTO [dbo].[UniqueKeys] _
        ([App],[System],[Key])VALUES('{0}','{1}',1)", CApp, CSystem));
        return 1;
    }
}

After executing it in parallel like before, the result is:

Id                   App                  System     Key
-------------------- -------------------- ---------- --------------------
1                    GetUniqueKeyBadCode  Shared     100

However, this still does not cover issues associated with running this method on multiple servers at the same time, for example, in a load balanced web service. For that, we must move our unique key generation to the database layer.

Using SQL Based Approach

You might be tempted to replicate the C# logic in SQL. After all, what would be simpler than creating a transaction, and executing few conditional SQL statements. My advise ... don't. You will quickly find out that simply relying on SQL transactions alone, will not guarantee thread safety. Instead, you must leverage a new SQL statement, which allows to execute either INSERT or UPDATE, in a single step. The statement is MERGE, which per Microsoft's documentation:

Performs insert, update, or delete operations on a target table based on the results of a join with a source table. For example, you can synchronize two tables by inserting, updating, or deleting rows in one table based on differences found in the other table.

Here is a sample stored procedure utilizing MERGE to either INSERT a new record, or UPDATE an existing record with a new value.

SQL

CREATE PROCEDURE [dbo].[GetUniqueKeyTV]
    @app VARCHAR(100) ,
    @sys VARCHAR(100) ,
    @retVal BIGINT OUT
AS 
    BEGIN
        DECLARE @T TABLE([Key] BIGINT);
        SET NOCOUNT ON

        MERGE INTO dbo.UniqueKeys WITH (TABLOCKX) AS Target
            USING ( VALUES
                ( @app ,
                  @sys ,
                  1
                ) ) AS Source ( [App], [System], [Key] )
            ON ( Target.App = Source.App
                 AND Target.[System] = Source.[System]
               )
            WHEN MATCHED 
                THEN UPDATE
                    SET     Target.[Key] = Target.[Key] + 1 
            WHEN NOT MATCHED BY TARGET 
                THEN 
       INSERT  (
              [App] ,
              [System] ,
              [Key]
                    
            ) VALUES
            ( Source.[App] ,
              Source.[System] ,
              Source.[Key]
                    
            ) OUTPUT 
               inserted.[Key] INTO @T;
        SELECT  @retVal = [Key]
        FROM    @T

        SET NOCOUNT OFF
    END

GO

In the first part, we tell MERGE which table will be updated, and we also advise the database engine to use table lock during any operation involving the target table. This ensures that no other records will be updated during execution of the MERGE statement, at the expense of increased number of concurrent table locks, however, no deadlocks are to be expected, since there is no dependency on any prior step.

Then, we proceed with describing our intent for each scenario. If a record is found matching an App/System pair, we simply increment the Key's value by 1. If a record is not found, we proceed with inserting a new record to the database, giving it the default value of 1. Regardless of either scenario, we store the new value, either 1 or n+1, in a table variable, so we can return it in the retVal OUTPUT parameter. There are other "OUTPUT objects" available as a results of executing a MERGE statement, and I strongly recommend you review its documentation at your own pace.

I have tested this approach with 200 parallel threads, 1000 requests each, using a simple utility written by Adam Mechanic, the SQLQueryStress, available from here. The final unique key received: 200,000, is exactly what I expected.

Summary

Thread safety is an extremely important, and much debated issue, amongst many developers. Some expect it just to work, somehow magically, others will argue its meaning. To me personally, it simply means that the code I wrote can be safely executed from multiple threads and environments, while returning a logically consistent result.

History

6/16/2013 - Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)