Introduction to Common Table Expressions (CTEs)

essentialSQL

4.80/5 (6 votes)

9 Jul 2016MIT5 min read

8.8K

Introduction to Common Table Expressions

Common Table Expressions or CTEs for short are used within SQL Server to simplify complex joins and subqueries, and to provide a means to query hierarchical data such as an organizational chart.

In this set of articles, we’ll introduce you to CTEs, the two types, and their uses. In this article, we’ll introduce CTEs. Once you’re familiar, I would encourage you to also read these articles as well:

Introduction to Common Table Expressions

A CTE (Common Table Expression) is a temporary result set that you can reference within another SELECT, INSERT, UPDATE, or DELETE statement. They were introduced in SQL Server version 2005. They are SQL compliant and part of the ANSI SQL 99 specification.

A CTE always returns a result set. They are used to simplify queries, for example, you could use one to eliminate a derived table from the main query body:

Note: All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server.

What is a CTE or Common Table Expression in SQL Server?

A CTE (Common Table Expression) defines a temporary result set which you can then use in a SELECT statement. It becomes a convenient way to manage complicated queries.

Common Table Expressions are defined within the statement using the WITH operator. You can define one or more common table expressions in this fashion.

Here is a really simple example of one CTE:

SQL

WITH Employee_CTE (EmployeeNumber, Title)
AS
(SELECT NationalIDNumber,
        JobTitle
 FROM   HumanResources.Employee)
SELECT EmployeeNumber,
       Title
FROM   Employee_CTE

Let’s break this down a bit.

CTE Query Definition

The blue portion is the CTE. Notice it contains a query that can be run on its own in SQL. This is called the CTE query definition:

SQL

SELECT NationalIDNumber,
       JobTitle
FROM   HumanResources.Employee

When you run it, you see results like:

CTE Query Definition Results

Notice that when we define the CTE, we give the result a name as well its columns. In this way, a CTE acts like a VIEW. The result and columns are named differently. This allows you to encapsulate complicated query logic with the common table expression.

Now going back to the CTE, notice the WITH statement. There you’ll see the name and columns are defined. These columns correspond to the columns returned from the inner query.

CTE Query Definition Column Mappings

Finally, notice that our final query references the CTE and columns defined.

From our outer query’s perspective, all it “sees” is this definition. It isn’t concerned with how the CTE is constructed, just its name and columns.

As such, the results from the CTE are:

Notice the column names, they’re based on those defined in the CTE.

I want to point out that you can define more than one CTE within a WITH statement. This can help you simplify some very complicated queries which are ultimately joined together. Each complicated piece can include in their own CTE which is then referred to and joined outside the WITH clause.

Here is an example using of TWO CTEs, it’s a simple example, but it shows how two CTEs are defined, and then used in an INNER JOIN.

SQL

WITH   <span style="color: #339966;">PersonCTE (BusinessEntityID, FirstName, LastName)</span>
<span style="color: #339966;">AS     (SELECT Person.BusinessEntityID,</span>
<span style="color: #339966;">               FirstName,</span>
<span style="color: #339966;">               LastName</span>
<span style="color: #339966;">        FROM   Person.Person</span>
<span style="color: #339966;">        WHERE  LastName LIKE 'C%')</span>,
<span style="color: #0000ff;">PhoneCTE (BusinessEntityID, PhoneNumber)</span>
<span style="color: #0000ff;">AS     (SELECT BusinessEntityID,</span>
<span style="color: #0000ff;">               PhoneNumber</span>
<span style="color: #0000ff;">        FROM   Person.PersonPhone)</span>
SELECT FirstName,
       LastName,
       PhoneNumber
FROM   <span style="color: #339966;">PersonCTE</span>
INNER JOIN
<span style="color: #0000ff;">PhoneCTE</span>
ON PersonCTE.BusinessEntityID = PhoneCTE.BusinessEntityID;

The first common table expression is colored green, the second blue. As you can see from the SELECT statement, the CTEs are joined as if they were tables. Hopefully, you can see that as your queries become more complicated, CTEs can become a really useful way to separate operations; therefore, simplify your final query.

Why Do You Need CTEs?

There are several reasons why you may want to use a CTE over other methods. Some of them include:

Readability – CTEs promote readability. Rather than lump all your query logic into one large query, create several CTEs, which are then combined later in the statement. This lets you get the chunks of data you need and combine them in a final SELECT.
Substitute for a View – You can substitute a CTE for a view. This is handy if you don’t have permissions to create a view object or you don’t want to create one as it is only used in this one query.
Recursion – Use CTEs to create recursive queries, that is queries that can call themselves. This is handy when you need to work on hierarchical data such as organization charts.
Limitations – Overcome SELECT statement limitations, such as referencing itself (recursion), or performing GROUP BY using non-deterministic functions.
Ranking – Whenever you want to use ranking function such as ROW_NUMBER(), RANK(), NTILE(), etc.

Types of CTEs

Common Table Expressions can be placed into two broad categories: Recursive CTEs and Non-Recursive CTEs.

Recursive CTEs are common table expressions that reference themselves. Recursion can be a pretty difficult topic to grasp, I really didn’t get it until I took a LISP class way back in 1986, but hopefully I can explain it to you.

We’ll go deep into recursion in a separate post, but for now let me introduce to you recursion using this diagram:

Here, you see a picture of opposing mirrors. Due to the reflection, it becomes a picture in a picture.

Recursive queries are like that.

As a recursive query is run, it repeatedly runs on a subset of the data. A recursive query is basically a query that calls itself. At some point, there is an end condition, so it doesn’t call itself indefinitely.

In a way, when you look into the picture, you can imagine each picture in a picture is the picture calling itself. However, unlike the “infinite reflection” in the mirrors, there comes a point where a recursive query encounters the end condition and stop calling itself.

At that point, the recursion starts to unwind, collect and calculate data as it reviews each successive result.

Non-Recursive CTEs, as the name implies, are common table expressions that don’t use recursion. They don’t reference themselves. They are easier to understand so we’ll look at them first in detail in the next article within this series.

Conclusion

Hopefully, you now have an appreciation of what CTEs are and why we may want to use them. In the next two article, we’ll go into much greater detail on CTEs, and when to use them.

Until then, I would recommend reviewing my articles on joins and subqueries as we’ll draw upon these types of queries for our examples.

The post Introduction to Common Table Expressions (CTEs) appeared first on Essential SQL.

License

This article, along with any associated source code and files, is licensed under The MIT License