Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Rowversion datatype in SQL Server: Track Which Rows Have Been Modified

4.43/5 (8 votes)
17 Dec 2013CPOL4 min read 40.1K  
SQL Server: Track which rows have been modified

Introduction

Very often, while working with data, our requirement is to retrieve data that has been modified and keep it up to data in two tables synchronized. For achieving this task, SQL Server has provided ways such as CDC (Change Data Capture), rowversion datatype, MERGE statement, and various other alternatives using joins, by storing datetime, etc., which can be used based on our requirements. Here, we will discuss about the rowversion datatype and see a small demo of how we can use rowversion to sync data changes done in a table to another table.

About rowversion

rowversion is a datatype which exposes an eight byte binary number, unique within a database. It is generally used for version stamping a table row. It means the value of the column of datatype rowversion of a particular row gets changed whenever value of any column of that row gets changed (updated). When a new insertion is made in the table, the rowversion datatype column automatically gets populated with a unique value.

How to Create a Table with the rowversion Data Type?

SQL
CREATE TABLE Product (
 ProductID INT IDENTITY PRIMARY KEY
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, ManufacturedBy VARCHAR(100) NOT NULL
, ProductKey rowversion
--,ProductKey1 rowversion
)

Here, the point to note is that, only one column of the rowversion datatype is allowed per table. So, if we try to create another column of the same datatype ProductKey1 by un-commenting it, we will end up with the error: "A table can only have one timestamp column. Because table 'Product' already has one, the column 'ProductKey1' cannot be added."

Insertion and Update

Insertion and update on the rowversion datatype column is not allowed. If you try, you will get the below error.

SQL
INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy, ProductKey)
VALUES ('Keyboard','Peripheral','Microsoft',CONVERT(BINARY(8),GETDATE()))

 --Msg 273, Level 16, State 1, Line 1 Cannot insert an explicit value into a timestamp column

UPDATE Product SET ProductKey= '0x000000000E1EB636' WHERE ProductID = 1

 --Msg 272, Level 16, State 1, Line, Cannot update a timestamp column.

If you notice, every time SQL Server is generating an error, it is referring to the rowversion datatype as the timestamp. This is because rowversion is a synonym for timestamp. In ANSI SQL, timestamp is a data type for date and time and SQL Server doesn't track an actual time that can be associated with a clock date and time, but represents relative time within a database. Microsoft decided to deprecate timestamp syntax in future versions and provided a synonym rowversion.

How Does It Work?

MSDN states that every database has a counter referred to as database rowversion. This counter gets incremented for each insert or update operation that is performed on a table that contains a rowversion column within the database and this value is inserted or updated in the rowversion column of the table. If at any point of time you need to check the value of this counter, use this:

SQL
SELECT @@DBTS;

Demo

Apart from the uses explained on MSDN, we can also use this to track changes in one table and update the changes in another table. Let's see a small demonstration.

Before we start, please read the below code written for illustration purposes only. Now let's say we have two tables: Product (created above) and PurchaseOrder. Somehow, our requirement is to keep update data of columns [ProductId], [ProductName], and [ProductCategory] of Product table into the PurchaseOrder table along with some other data. We can do this by creating a column [ProductKey] of rowversion datatype in the Product table, so that we can find out which rows have been changed from the last time we read data from it and a column of BINARY datatype in PurchaseOrder table to store the value of the [ProductKey] column of the Product table.

Let's create a PurchaseOrder table, insert some test data in the Product table, and also in the PurchaseOrder table.

SQL
CREATE TABLE PurchaseOrder (
 ProductId INT FOREIGN KEY REFERENCES Product(ProductId)
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, Price DECIMAL NOT NULL
, Quantity INT
, ProductKey BINARY(8) )

 -- Insert test Data in Product Table 
INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy)
VALUES ('Keyboard','Peripheral','Microsoft')
 , ('Mouse','Peripheral','Microsoft')
 , ('Headphone','Peripheral','Microsoft')

 -- Insert test data in PurchaseOrder Table 
INSERT INTO PurchaseOrder
 (ProductId , ProductName , ProductCatagory , Price , Quantity , ProductKey)
SELECT ProductId , ProductName , ProductCatagory , _
ProductId * 100 , ProductId*2 , ProductKey FROM Product

By the time you execute the SELECT statement on both tables, you will find both tables are in sync. Now let's change data in the Product table.

SQL
UPDATE Product
SET ProductName = 'WireLess Keyboad'
 , ProductCatagory ='Keyboad'
OUTPUT deleted.ProductKey 'Old Prod Key'
WHERE ProductName = 'Keyboard'

You can use the below query to check which data has been changed.

SQL
SELECT P.* FROM Product P
JOIN PurchaseOrder PO
  ON P.ProductID = PO.ProductId
  AND P.ProductKey <> PO.ProductKey

For synchronization, you can use either of the following ways:

SQL
UPDATE PO
    SET ProductName= P.ProductName
  , ProductCatagory = P.ProductCatagory
  , ProductKey = P.ProductKey
FROM Product P
JOIN PurchaseOrder PO
   ON P.ProductID = PO.ProductId
   AND P.ProductKey <> PO.ProductKey
SQL
MERGE INTO PurchaseOrder TargetTable
USING Product SourceTable  
ON TargetTable.ProductId = SourceTable.ProductId
WHEN MATCHED
      AND TargetTable.ProductKey <> SourceTable.ProductKey
THEN
      UPDATE SET ProductName= SourceTable.ProductName
     , ProductCatagory = SourceTable.ProductCatagory
     , ProductKey = SourceTable.ProductKey ;
  1. Using JOIN
  2. Using MERGE statement

Points to Remember

Since data of a rowversion column gets changed, if any update occurs on that row, it is a very poor candidate for keys. If it is a key then, even an update on a non key column will also generate an index update. We should not make it the Primary Key as well because of the index update issue explained above and also referencing a foreign key can lose their values which they are referring to. The value of rowversion gets changed even for false updates. See the result of the below code, we are updating ProductName with the same value 'Keyboard', which is its current value, even then the value of ProductKey [rowversion DataType] columns get changed.

References

  1. More about rowversion datatype
  2. More about OUTPUT clause
  3. More about MERGE statement

Happy programming!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)