In this SQL minute, we’ll look at how to create a SQL server hierarchical query using the hierarchyid
data type. This is a two-part article. In Part one, you were introduced to the hierarchyid
type. In this article, we dig deeper into the hierarchyid
functions.
If you’re studying for the 70-761 exam, you’ll find there aren’t many intermediate SQL problems with explanations for you to solve. To help you out, I’ve put together my SQL Practice Workbook. When you subscribe to receive it, you’ll not only get my workbook with video answers, but an engaging problem to solve sent to your inbox each week.
For this problem, we will use the AdventureWorks
database.
Solving the Request using a SQL Server Hierarchical Query
AdventureWorks
is concerned their manager’s span-of-control, that is how many employees report to them, is out of whack. To address this, the HR director would like a report that displays the job title, employee full name, their boss’ full name, and how many people, including themselves are in their organization.
Here is an example of what she is looking to receive from you:
SQL Server Hierarchical Query Results
We’ll tackle this query in steps.
Step 1 – Start Small
From our last problem, you should be familiar with the hierarchyid
SQL type and its use to represent a tree like structure and it provides several functions, such as GetAncestor
, to navigate hierarchies, such as org structures.
To start with this query, lets get the employee and their name. To do this, we’ll use an INNER JOIN
:
SELECT E.JobTitle,
P.FirstName + ' ' + P.LastName as FullName
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
Step 2 – Build Up Success
Now let’s build on that success and go grab the “report to” column values. This is a bit more complicated, as we need to introduce a self-join on the employee
table to get the employee
’s boss
. In SQL server hierarchical query speak, this is the ancestor.
As you can see from the diagram, the top of the tree is the root, for the Server Manager, bosses are ancestors, and bosses’ direct reports descendants. Another way to look at the tree is to think of it as a family tree, where there are parents and children.
So, back to getting the Report To name… to do this, we’ll be joining to employee
, and then again to person
to get the name
.
Here is a table diagram showing the relationships:
The tables are aliased: P
and E
are the person
and employee
table from our original query. R
is the alias for the employee
table to get the boss
, and RP
is an alias of the person
table so we can match the boss to their name.
SELECT E.JobTitle,
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode
We’re getting closer! We do not have a way to get an employee
’s boss
, but check out the results, there is a small problem. The CEO is missing!
The problem with this query is that the CEO doesn’t report to anyone, so they aren’t included… we need to fix this.
Step 3 – Include CEO in SQL Server Hierarchical Query
The issue with the query is the GetAncestor function doesn’t return a value for the top-most person in the tree. Given this, our join
condition can’t match. To fix this, we can add another join condition which specifically addresses the CEO.
Keep in mind when working with join
s that your join
conditions can include Boolean logic. We’re going to rewrite our join
so it matches a record from aliased table R
, if either of the following conditions are met:
- The employee’s ancestor
OrganizationNode
matches another OrganizationNode
. This is the existing join
condition. - The
employee
’s OrganizationNode
is the Root
and OrganizationNode
matches another. This is the CEO, as the root node is the topmost node, just like the CEO is the top person in an organization.
Here is the modified query, note the OR
operator within the second join
condition. Also, note how I enclose this second join
condition, the one to handle the CEO, in parenthesis.
SELECT E.JobTitle,
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
(E.OrganizationNode = hierarchyid::GetRoot() AND
R.OrganizationNode = E.OrganizationNode)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode
Address Org Size using Hierarchyid Functions
The last step to complete is adding the org
size to our results. At first, you may think you’ll need to write some sort of recursive CTE to traverse the tree and count up reports, but once you think about it, you’ll realize you can simply test whether an employee
’s OrganizationNode
is a descendant of the current employee
s.
This becomes a simple job, as you can use the IsDecendantOf
function and a correlated subquery to count rows where this condition is met. The subquery is correlated, as part of its criteria includes a reference to the outer query.
Here is an example of the completed SQL server hierarchical query.
SELECT E.JobTitle,
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo,
(SELECT COUNT(1)
FROM HumanResources.Employee
WHERE OrganizationNode.IsDescendantOf(E.OrganizationNode) = 1) as OrgSize
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
(E.OrganizationNode = hierarchyid::GetRoot() AND
R.OrganizationNode = E.OrganizationNode)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode
Now that you have seen how to query tables that use the hierarchyid
type, what other kinds of problems can you see solving with this type’s built-in functions?
Also, please keep in mind, depending on your database size, you may be better off using a different type of design to represent your hierarchy.
If you have any comments regarding this post, or if you have any ideas on representing a tree, let us know in the comments below.
The post SQL Server Hierarchical Query using the hierarchyid type appeared first on Essential SQL.