Introduction
With SQL Server, the NOLOCK hint is very tempting, especially with SELECT statements where at least one of the tables involved is frequently updated. Using NOLOCK removes the blocking, and the SELECT statement runs without waiting, but not without side effects.
Why do we need a hint at all
The first question might be why we need to add this hint at all to get the statement running without additional waits. The answer lies in a basic database management system principle: ACID (atomicity, consistency, isolation, and durability). The isolation property defines that other transactions may not see incomplete modifications. Since SQL Server (under the default locking behaviour) takes exclusive locks for modifications, and exclusive locks prevent acquiring read (shared) locks, all reading operations must wait until the exclusive locks are released. This causes the SELECT statement to wait whenever the data it needs to read is locked exclusively.
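If you want to see this for yourself while the blocking scenario shown later in this article is running, the sys.dm_tran_locks view lists the locks each session holds or is waiting for. A minimal sketch (the column list is trimmed to the essentials):
SELECT request_session_id,
       resource_type,
       request_mode,    -- S (shared), X (exclusive), IX, ...
       request_status   -- GRANT or WAIT
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
ORDER BY request_session_id;
The writing session shows granted exclusive (X) locks, and the blocked reader shows shared (S) lock requests with the status WAIT.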
Using the NOLOCK hint (which means exactly the same as READUNCOMMITTED) bypasses the check for exclusive locks and does not take shared locks at all. This also means that the query sees data that has not yet been committed to the database, which may result in a dirty read. In other words, the modifications are not isolated, so one of the main database principles is ignored. This is why NOLOCK-like behaviour isn't used by default.
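Because NOLOCK and READUNCOMMITTED mean the same thing, the same behaviour can also be requested for a whole session by changing the transaction isolation level instead of adding the hint to every table reference. A short sketch, using the LockTest.OrderEntry table created in the next section:
-- Every SELECT in this session now reads uncommitted data,
-- exactly as if each table reference had the NOLOCK hint.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT COUNT(*) FROM LockTest.OrderEntry;

-- Return to the default behaviour.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;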
What it actually means
Let's take an example. First let's create the necessary objects.
CREATE SCHEMA LockTest;
GO
IF OBJECT_ID ( 'LockTest.OrderEntry', 'U' ) IS NOT NULL
DROP TABLE LockTest.OrderEntry;
GO
CREATE TABLE LockTest.OrderEntry (
   Id int not null identity(1,1) primary key,
   Amount int not null,
   UnitPrice decimal not null, -- decimal without precision/scale defaults to decimal(18,0), so prices are stored as whole numbers
   Area int not null
);
GO
IF OBJECT_ID ( 'LockTest.AddOrders', 'P' ) IS NOT NULL
DROP PROCEDURE LockTest.AddOrders;
GO
CREATE PROCEDURE LockTest.AddOrders @OrderCount int AS
BEGIN
   DECLARE @counter int = 0;
   WHILE (@counter < @OrderCount) BEGIN
      INSERT INTO LockTest.OrderEntry
         (Amount, UnitPrice, Area)
      VALUES
         (ROUND(RAND()*100, 0), ROUND(RAND()*100, 2), ROUND(RAND()*10, 0));
      SET @counter = @counter + 1;
   END;
END;
GO
So now we have a single table and a procedure to create some random data. Let's add something to the table.
BEGIN TRANSACTION;
EXEC LockTest.AddOrders @OrderCount=100;
COMMIT;
Now if we query the sales amounts per area, the result is something like the following. Note that your results will differ because of the random values.
SELECT TOP(3)
Area, SUM(Amount * UnitPrice) AS TotalSales
FROM LockTest.OrderEntry
GROUP BY Area
ORDER BY Area;
And the results:
Area TotalSales
---- ----------
0 10454
1 22608
2 22038
Simultaneous transactions, normal behaviour
Now if the database is updated and queried at the same time, what happens in the normal situation? The DML operation takes the necessary exclusive locks and may block the query until they are released. This can be tested as follows; the test requires two simultaneous connections to the database.
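If you want to keep track of which window is which (and to match the session ids shown by the diagnostic queries), you can first check each connection's session id:
-- Run in both windows; note the ids for session 1 and session 2.
SELECT @@SPID AS SessionId;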
In session 1, add new rows in a transaction and leave the transaction open.
BEGIN TRANSACTION
EXEC LockTest.AddOrders @OrderCount=10000;
In session 2, query again for the top 3 areas. This query won't finish; it stays running, waiting for the exclusive locks to be released.
SELECT TOP(3)
Area, SUM(Amount * UnitPrice) AS TotalSales
FROM LockTest.OrderEntry
GROUP BY Area
ORDER BY Area;
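While session 2 is waiting, a third connection (or session 1 itself) can confirm what is going on; a sketch using sys.dm_exec_requests, where the blocked SELECT shows the wait type LCK_M_S:
SELECT session_id,
       blocking_session_id,  -- the session holding the exclusive locks
       wait_type,            -- LCK_M_S for the blocked shared-lock request
       wait_time,            -- milliseconds spent waiting so far
       command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;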
Now when session 1 rolls back, the transaction locks are released and the query in session 2 continues.
ROLLBACK;
And immediately the results are displayed for session 2.
Area TotalSales
---- ----------
0 10454
1 22608
2 22038
As you can see, the results are the same as before since all the modifications were rolled back.
Results when NOLOCK is used
Now we use the exact same scenario as previously, but this time with the NOLOCK hint.
In session 1, again add new rows in a transaction, and leave the transaction open.
BEGIN TRANSACTION
EXEC LockTest.AddOrders @OrderCount=10000;
In session 2, query again for the top 3 areas, but this time with the NOLOCK hint.
SELECT TOP(3)
Area, SUM(Amount * UnitPrice) AS TotalSales
FROM LockTest.OrderEntry WITH (NOLOCK)
GROUP BY Area
ORDER BY Area;
The results are shown immediately (again, your results may differ because of the random values).
Area TotalSales
---- ----------
0 1321810
1 2417946
2 2539965
Now session 1 rolls back the transaction.
ROLLBACK;
And the exact same query is executed in session 2.
SELECT TOP(3)
Area, SUM(Amount * UnitPrice) AS TotalSales
FROM LockTest.OrderEntry WITH (NOLOCK)
GROUP BY Area
ORDER BY Area;
The results are back to the original values since session 1 rolled back all of its modifications.
Area TotalSales
---- ----------
0 10454
1 22608
2 22038
Conclusions
What we saw was a dirty read. For a moment, we saw data in the database which, in terms of isolation, was never committed. So is this a bad thing? Well, it depends on the business logic. If you're creating a banking application, I'd bet that a dirty read is something you really want to avoid. On the other hand, if rollbacks are rare in your environment, and the incorrect data you might occasionally see has no harmful impact when they do happen, then you can consider using NOLOCK.
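Finally, if you want to remove the test objects after experimenting, the following should do it (assuming nothing else was created in the LockTest schema):
DROP PROCEDURE LockTest.AddOrders;
DROP TABLE LockTest.OrderEntry;
DROP SCHEMA LockTest;
GO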
History
- February 11, 2011: Created.