Introduction
Very often, while working with data, our requirement is to retrieve data that has been modified and keep it up to data in two tables synchronized. For achieving this task, SQL Server has provided ways such as CDC (Change Data Capture), rowversion
datatype, MERGE
statement, and various other alternatives using join
s, by storing datetime
, etc., which can be used based on our requirements. Here, we will discuss about the rowversion
datatype and see a small demo of how we can use rowversion
to sync data changes done in a table to another table.
About rowversion
rowversion
is a datatype which exposes an eight byte binary number, unique within a database. It is generally used for version stamping a table row. It means the value of the column of datatype rowversion
of a particular row gets changed whenever value of any column of that row gets changed (updated). When a new insertion is made in the table, the rowversion
datatype column automatically gets populated with a unique value.
How to Create a Table with the rowversion Data Type?
CREATE TABLE Product (
ProductID INT IDENTITY PRIMARY KEY
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, ManufacturedBy VARCHAR(100) NOT NULL
, ProductKey rowversion
)
Here, the point to note is that, only one column of the rowversion
datatype is allowed per table. So, if we try to create another column of the same datatype ProductKey1
by un-commenting it, we will end up with the error: "A table can only have one timestamp column. Because table 'Product' already has one, the column 'ProductKey1' cannot be added."
Insertion and Update
Insertion and update on the rowversion
datatype column is not allowed. If you try, you will get the below error.
INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy, ProductKey)
VALUES ('Keyboard','Peripheral','Microsoft',CONVERT(BINARY(8),GETDATE()))
UPDATE Product SET ProductKey= '0x000000000E1EB636' WHERE ProductID = 1
If you notice, every time SQL Server is generating an error, it is referring to the rowversion
datatype as the timestamp. This is because rowversion
is a synonym for timestamp
. In ANSI SQL, timestamp
is a data type for date and time and SQL Server doesn't track an actual time that can be associated with a clock date and time, but represents relative time within a database. Microsoft decided to deprecate timestamp
syntax in future versions and provided a synonym rowversion
.
How Does It Work?
MSDN states that every database has a counter referred to as database rowversion
. This counter gets incremented for each insert
or update
operation that is performed on a table that contains a rowversion
column within the database and this value is inserted or updated in the rowversion
column of the table. If at any point of time you need to check the value of this counter, use this:
SELECT @@DBTS;
Demo
Apart from the uses explained on MSDN, we can also use this to track changes in one table and update the changes in another table. Let's see a small demonstration.
Before we start, please read the below code written for illustration purposes only. Now let's say we have two tables: Product
(created above) and PurchaseOrder
. Somehow, our requirement is to keep update data of columns [ProductId
], [ProductName
], and [ProductCategory
] of Product
table into the PurchaseOrder
table along with some other data. We can do this by creating a column [ProductKey
] of rowversion
datatype in the Product
table, so that we can find out which rows have been changed from the last time we read data from it and a column of BINARY
datatype in PurchaseOrder
table to store the value of the [ProductKey
] column of the Product
table.
Let's create a PurchaseOrder
table, insert some test data in the Product
table, and also in the PurchaseOrder
table.
CREATE TABLE PurchaseOrder (
ProductId INT FOREIGN KEY REFERENCES Product(ProductId)
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, Price DECIMAL NOT NULL
, Quantity INT
, ProductKey BINARY(8) )
INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy)
VALUES ('Keyboard','Peripheral','Microsoft')
, ('Mouse','Peripheral','Microsoft')
, ('Headphone','Peripheral','Microsoft')
INSERT INTO PurchaseOrder
(ProductId , ProductName , ProductCatagory , Price , Quantity , ProductKey)
SELECT ProductId , ProductName , ProductCatagory , _
ProductId * 100 , ProductId*2 , ProductKey FROM Product
By the time you execute the SELECT
statement on both tables, you will find both tables are in sync. Now let's change data in the Product
table.
UPDATE Product
SET ProductName = 'WireLess Keyboad'
, ProductCatagory ='Keyboad'
OUTPUT deleted.ProductKey 'Old Prod Key'
WHERE ProductName = 'Keyboard'
You can use the below query to check which data has been changed.
SELECT P.* FROM Product P
JOIN PurchaseOrder PO
ON P.ProductID = PO.ProductId
AND P.ProductKey <> PO.ProductKey
For synchronization, you can use either of the following ways:
UPDATE PO
SET ProductName= P.ProductName
, ProductCatagory = P.ProductCatagory
, ProductKey = P.ProductKey
FROM Product P
JOIN PurchaseOrder PO
ON P.ProductID = PO.ProductId
AND P.ProductKey <> PO.ProductKey
MERGE INTO PurchaseOrder TargetTable
USING Product SourceTable
ON TargetTable.ProductId = SourceTable.ProductId
WHEN MATCHED
AND TargetTable.ProductKey <> SourceTable.ProductKey
THEN
UPDATE SET ProductName= SourceTable.ProductName
, ProductCatagory = SourceTable.ProductCatagory
, ProductKey = SourceTable.ProductKey ;
- Using
JOIN
- Using
MERGE
statement
Points to Remember
Since data of a rowversion
column gets changed, if any update occurs on that row, it is a very poor candidate for keys. If it is a key then, even an update on a non key column will also generate an index update. We should not make it the Primary Key as well because of the index update issue explained above and also referencing a foreign key can lose their values which they are referring to. The value of rowversion
gets changed even for false updates. See the result of the below code, we are updating ProductName
with the same value 'Keyboard
', which is its current value, even then the value of ProductKey
[rowversion
DataType] columns get changed.
References
- More about rowversion datatype
- More about OUTPUT clause
- More about MERGE statement
Happy programming!