Introduction
Writing or reading a complex SQL query that uses a number of joins is a headache for developers. Common Table Expressions (CTEs) can make complex SQL statements easier to understand and maintain. In this post, I will try to cover some of the benefits we get when working with CTEs.
When dealing with subqueries, you often need to select part of the data from a subquery, or join the data from a query with other tables. In that case, you can either name the subquery with an alias or use it directly. As your requirements grow more complex, the query can quickly become unmaintainable. A CTE allows you to define the subquery once, name it using an alias, and later refer to the same data through that alias, just as you would with a normal table. CTEs are part of the ANSI SQL standard.
Using the Code
Say, for instance, you have a query like this:
SELECT * FROM (
    SELECT A.Address, E.Name, E.Age
    FROM Address A
    INNER JOIN Employee E ON E.EID = A.EID
) T
WHERE T.Age > 50
ORDER BY T.Name
The query is already hard to follow. If I need to wrap anything more around it, it gradually becomes unreadable. A CTE allows you to define named result sets up front and use them later, when the final query actually produces the output.
Rewritten using a CTE, the query looks like this:
WITH T(Address, Name, Age)
AS
(
    SELECT A.Address, E.Name, E.Age
    FROM Address A
    INNER JOIN Employee E ON E.EID = A.EID
)
SELECT * FROM T
WHERE T.Age > 50
ORDER BY T.Name
As you can see, the second query is much more readable. You can define as many CTEs as you want, and the final query, which returns the data to the caller, can reference all of them.
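To check that the derived-table form and the CTE form really return the same rows, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a convenient stand-in for SQL Server here, and the sample rows are invented for illustration; the table and column names follow the queries above.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    CREATE TABLE Address  (EID INTEGER, Address TEXT);
    INSERT INTO Employee VALUES (1, 'Alice', 55), (2, 'Bob', 42), (3, 'Carol', 61);
    INSERT INTO Address  VALUES (1, '12 Oak St'), (2, '9 Elm Rd'), (3, '4 Pine Ave');
""")

# Derived-table (subquery) form.
subquery_form = """
    SELECT * FROM (
        SELECT A.Address, E.Name, E.Age
        FROM Address A
        INNER JOIN Employee E ON E.EID = A.EID
    ) T
    WHERE T.Age > 50
    ORDER BY T.Name
"""

# The same query with the subquery lifted into a named CTE.
cte_form = """
    WITH T(Address, Name, Age) AS (
        SELECT A.Address, E.Name, E.Age
        FROM Address A
        INNER JOIN Employee E ON E.EID = A.EID
    )
    SELECT * FROM T
    WHERE T.Age > 50
    ORDER BY T.Name
"""

subquery_rows = conn.execute(subquery_form).fetchall()
cte_rows = conn.execute(cte_form).fetchall()
print(cte_rows)
```

Both forms should return the same two employees older than 50; only the layout of the query changes.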
WITH T1(Address, Name, Age)
AS
(
    SELECT A.Address, E.Name, E.Age
    FROM Address A
    INNER JOIN Employee E ON E.EID = A.EID
),
T2(Name, Desig)
AS
(
    SELECT Name, Desig FROM Designation
)
SELECT T1.*, T2.Desig
FROM T1
INNER JOIN T2 ON T2.Name = T1.Name
WHERE T1.Age > 50
ORDER BY T1.Name
The CTE definitions are separated by commas. You can define as many of them as you want; each one acts like a subquery that fetches data and exposes it under a name, like a temporary table that exists only for the duration of the query.
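The comma-separated form can be tried out with the following sketch, again using Python's sqlite3 module as a stand-in for SQL Server. The sample rows are hypothetical. Note that every CTE the final query reads from has to appear in its FROM clause (here through an INNER JOIN), exactly like a normal table.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee    (EID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    CREATE TABLE Address     (EID INTEGER, Address TEXT);
    CREATE TABLE Designation (Name TEXT, Desig TEXT);
    INSERT INTO Employee    VALUES (1, 'Alice', 55), (2, 'Bob', 42), (3, 'Carol', 61);
    INSERT INTO Address     VALUES (1, '12 Oak St'), (2, '9 Elm Rd'), (3, '4 Pine Ave');
    INSERT INTO Designation VALUES ('Alice', 'Manager'), ('Bob', 'Developer'),
                                   ('Carol', 'Director');
""")

query = """
    WITH T1(Address, Name, Age) AS (
        SELECT A.Address, E.Name, E.Age
        FROM Address A
        INNER JOIN Employee E ON E.EID = A.EID
    ),
    T2(Name, Desig) AS (
        SELECT Name, Desig FROM Designation
    )
    SELECT T1.Address, T1.Name, T1.Age, T2.Desig
    FROM T1
    INNER JOIN T2 ON T2.Name = T1.Name   -- both CTEs are joined like tables
    WHERE T1.Age > 50
    ORDER BY T1.Name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```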
According to the syntax, a CTE starts with a WITH clause. You can specify the column names in parentheses after the CTE name, but it is not mandatory.
Common Table Expression Syntax
A Common Table Expression contains three core parts:
- The CTE name (this is what follows the WITH keyword)
- The column list (optional)
- The query (appears within parentheses after the AS keyword)
The query using the CTE must be the first query appearing after the CTE.
When to Use Common Table Expressions
Common Table Expressions offer the same functionality as a view, but are ideal for one-off usages where you don't necessarily need a view defined for the system. Even when a CTE is not necessarily needed, it can improve readability. In Using Common Table Expressions, Microsoft offers the following four advantages of CTEs:
- Create a recursive query.
- Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
- Enable grouping by a column that is derived from a scalar subselect, or a function that is either not deterministic or has external access.
- Reference the resulting table multiple times in the same statement.
Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.
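As a rough illustration of the building-block idea, the sketch below defines one CTE, builds a second CTE on top of it, and then references the first one again in the final query. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server; the CTE names Staff and Stats and the sample rows are made up for this example.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    INSERT INTO Employee VALUES (1, 'Alice', 30), (2, 'Bob', 40),
                                (3, 'Carol', 50), (4, 'Dave', 60);
""")

query = """
    WITH Staff(Name, Age) AS (
        SELECT Name, Age FROM Employee
    ),
    Stats(AvgAge) AS (
        SELECT AVG(Age) FROM Staff       -- second block builds on the first
    )
    SELECT S.Name, S.Age
    FROM Staff S                         -- Staff is referenced a second time
    CROSS JOIN Stats
    WHERE S.Age > Stats.AvgAge
    ORDER BY S.Name
"""

older_than_average = conn.execute(query).fetchall()
print(older_than_average)
```

With the sample ages 30, 40, 50 and 60 the average is 45, so only the two older employees are returned.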
The results of scalar subqueries (such as (SELECT COUNT(1) FROM ...)) cannot be grouped or filtered directly in the containing query. Similarly, when using SQL Server 2005's ranking functions - ROW_NUMBER(), RANK(), DENSE_RANK(), and so on - the containing query cannot include a filter or grouping expression to return only a subset of the ranked results. For both of these instances, CTEs are quite handy.
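The ranking case can be sketched as follows: the CTE computes ROW_NUMBER(), and the outer query is then free to filter on it. The example runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server, and assumes an SQLite build with window function support (3.25 or later). The Dept column and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; requires SQLite 3.25+ for window functions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Name TEXT, Dept TEXT, Age INTEGER);
    INSERT INTO Employee VALUES ('Alice', 'Sales', 55), ('Bob', 'Sales', 42),
                                ('Carol', 'IT', 61),    ('Dave', 'IT', 38);
""")

query = """
    WITH Ranked AS (
        SELECT Name, Dept, Age,
               ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Age DESC) AS RowNum
        FROM Employee
    )
    SELECT Dept, Name, Age
    FROM Ranked
    WHERE RowNum = 1          -- filtering on the rank works once it is a CTE column
    ORDER BY Dept
"""

oldest_per_dept = conn.execute(query).fetchall()
print(oldest_per_dept)
```

Writing the WHERE RowNum = 1 filter in the same SELECT that computes ROW_NUMBER() is not allowed, which is why the ranking is wrapped in a CTE first.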
CTEs can also be used to recursively enumerate hierarchical data.
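A minimal sketch of such a hierarchy walk is shown below, using a hypothetical Staff table in which each row points at its manager. It runs on SQLite through Python's sqlite3 module, where the recursive form is written WITH RECURSIVE; SQL Server uses plain WITH for the same construct.

```python
import sqlite3

# Hypothetical manager/employee hierarchy; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (EID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER);
    INSERT INTO Staff VALUES (1, 'Alice', NULL), (2, 'Bob', 1),
                             (3, 'Carol', 1),    (4, 'Dave', 2);
""")

query = """
    WITH RECURSIVE Chain(EID, Name, Level) AS (
        -- Anchor member: the root of the hierarchy (no manager).
        SELECT EID, Name, 0 FROM Staff WHERE ManagerID IS NULL
        UNION ALL
        -- Recursive member: everyone who reports to a row already in Chain.
        SELECT S.EID, S.Name, C.Level + 1
        FROM Staff S
        INNER JOIN Chain C ON S.ManagerID = C.EID
    )
    SELECT Name, Level FROM Chain
    ORDER BY Level, Name
"""

hierarchy = conn.execute(query).fetchall()
for name, level in hierarchy:
    print("  " * level + name)
```

The recursion ends on its own when a pass of the recursive member finds no further reports.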
Points of Interest
Here is a silly series generated with a recursive CTE:
WITH ShowMessage(STATEMENT, LENGTH)
AS
(
    SELECT STATEMENT = CAST('I Like ' AS VARCHAR(300)), LEN('I Like ')
    UNION ALL
    SELECT CAST(STATEMENT + 'CodeProject! ' AS VARCHAR(300)), LEN(STATEMENT)
    FROM ShowMessage
    WHERE LENGTH < 300
)
SELECT STATEMENT, LENGTH FROM ShowMessage
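This returns one row per recursion step: each row's STATEMENT has 'CodeProject! ' appended once more than the previous row, and the recursion stops once LENGTH is no longer below 300.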
History
- 6th October, 2011: Initial version