Introduction
Computed columns are type of columns in which the values are derived based on one or more other columns. Hence the data type on the computed column depends on the result of the derived column values.
Computed columns is a feature which has been there in SQL Server since version 2000. But in my experience, I feel that it has been a feature which has been used less compared to many other features available, and during discussions and interviews, this is something which most developers slip or fail to answer.
Why Do We Need Computed Columns?
First, we will consider a case where we need to store details on a table without the usage of computed columns.
Consider we have a table which contains employee
details. We have two columns to store employee
’s first and last names. But we also required to have a column which we need to store their full name as well by concatenating the first and last names. So the correct way is to have the third column which contains the full name and the data needs to be inserted while the employee record is created and it should be maintained in the case where the details are updated as well. Otherwise, the data integrity will be lost. (One might debate that the full name can be built from the business logic code using the first and last names. But for illustration purposes, we would consider that we are maintaining it using SQL Server.)
CREATE TABLE dbo.Employee(
Id INT
,FirstName VARCHAR(30)
,LastName VARCHAR(30)
,FullName VARCHAR(61)
)
However, we could achieve the same with the use of a computed column and with less effort compared to the first approach.
CREATE TABLE dbo.Employee(
Id INT
,FirstName VARCHAR(30)
,LastName VARCHAR(30)
,FullName AS CONCAT(FirstName,' ',LastName)
)
Let’s insert few records to the table which we created now:
INSERT INTO dbo.Employee(Id, FirstName, LastName)
VALUES (1,’John’,’Doe'),(2,’Jane’,’Doe')
PERSISTED, DETERMINISTIC or NON-DETERMINISTIC ?
The values reflected on computed column can be either deterministic or persisted.
When the values are deterministic or non-deterministic, the value in the column will not be saved on to the table physically. Instead, it is always calculated during the query execution. Hence the value could differ based on the functions you use in the formula. E.g.: If you use GETDATE()
in the calculated column, it will always return a different value during each execution.
CREATE TABLE dbo.Employee2(
Id INT
,FirstName VARCHAR(30)
,LastName VARCHAR(30)
,CreatedDate AS GETDATE()
)
INSERT INTO dbo.Employee2(Id, FirstName, LastName)
VALUES (1,'John','Doe')
And when queried, the calculated column returns different values as shown below:
**Note: The above mentioned can be achieved using a default constraint as well. I have used that example on strictly illustration basis.
You can further read on deterministic and non-deterministic function on the following Microsoft documentation:
Computed column values can be persisted by adding the keyword PERSISTED
when the column is created using T-SQL or by the table designer in SSMS. We will drop ‘FullName
’ column and recreate the column.
ALTER TABLE dbo.Employee DROP COLUMN FullName;
ALTER TABLE dbo.Employee
ADD FullName AS CONCAT(FirstName,' ',LastName) PERSISTED;
**Note: If you try to drop the ‘CreatedDate
’ column on Employee2
and try to create it as PERSISTED
, it will throw an error. Because computed columns can only be persisted when it’s deterministic.
Msg 4936, Level 16, State 1, Line 45
Computed column 'CreatedDate' in table 'Employee2' cannot be persisted
because the column is non-deterministic.
Now when the expression is evaluated during the execution, the ‘FullName
’ will be saved into the table.
The data is read-only to the developer and it’s maintained by the engine. When the data is changed on the columns which was used in the formula, the computed values will be changed.