Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

SQL Server Hierarchical Query using the hierarchyid Type

5.00/5 (2 votes)
26 Apr 2019MIT4 min read 4.6K  
SQL Server Hierarchical Query using the hierarchyid type

In this SQL minute, we’ll look at how to create a SQL server hierarchical query using the hierarchyid data type. This is a two-part article. In Part one, you were introduced to the hierarchyid type. In this article, we dig deeper into the hierarchyid functions.

If you’re studying for the 70-761 exam, you’ll find there aren’t many intermediate SQL problems with explanations for you to solve. To help you out, I’ve put together my SQL Practice Workbook. When you subscribe to receive it, you’ll not only get my workbook with video answers, but an engaging problem to solve sent to your inbox each week.

For this problem, we will use the AdventureWorks database.

Solving the Request using a SQL Server Hierarchical Query

AdventureWorks is concerned their manager’s span-of-control, that is how many employees report to them, is out of whack. To address this, the HR director would like a report that displays the job title, employee full name, their boss’ full name, and how many people, including themselves are in their organization.

Here is an example of what she is looking to receive from you:

SQL Server Hierarchical Query Results

SQL Server Hierarchical Query Results

We’ll tackle this query in steps.

Step 1 – Start Small

From our last problem, you should be familiar with the hierarchyid SQL type and its use to represent a tree like structure and it provides several functions, such as GetAncestor, to navigate hierarchies, such as org structures.

To start with this query, lets get the employee and their name. To do this, we’ll use an INNER JOIN:

SQL
SELECT E.JobTitle, 
        P.FirstName + ' ' + P.LastName as FullName
 FROM   HumanResources.Employee E
        INNER JOIN Person.Person P 
        ON P.BusinessEntityID = E.BusinessEntityID

Step 2 – Build Up Success

Now let’s build on that success and go grab the “report to” column values. This is a bit more complicated, as we need to introduce a self-join on the employee table to get the employee’s boss. In SQL server hierarchical query speak, this is the ancestor.

Image 2

As you can see from the diagram, the top of the tree is the root, for the Server Manager, bosses are ancestors, and bosses’ direct reports descendants. Another way to look at the tree is to think of it as a family tree, where there are parents and children.

So, back to getting the Report To name… to do this, we’ll be joining to employee, and then again to person to get the name.

Here is a table diagram showing the relationships:

SQL Query Table Relationships

The tables are aliased: P and E are the person and employee table from our original query. R is the alias for the employee table to get the boss, and RP is an alias of the person table so we can match the boss to their name.

SQL
SELECT E.JobTitle, 
        P.FirstName + ' ' + P.LastName as FullName,
        RP.FirstName + ' ' + RP.LastName as ReportsTo
 FROM   HumanResources.Employee E
        INNER JOIN Person.Person P
        ON P.BusinessEntityID = E.BusinessEntityID
        INNER JOIN HumanResources.Employee R
        ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) 
        INNER JOIN Person.Person RP
        ON RP.BusinessEntityID = R.BusinessEntityID
 ORDER BY E.OrganizationNode

We’re getting closer! We do not have a way to get an employee’s boss, but check out the results, there is a small problem. The CEO is missing!

The problem with this query is that the CEO doesn’t report to anyone, so they aren’t included… we need to fix this.

Step 3 – Include CEO in SQL Server Hierarchical Query

The issue with the query is the GetAncestor function doesn’t return a value for the top-most person in the tree. Given this, our join condition can’t match. To fix this, we can add another join condition which specifically addresses the CEO.

Keep in mind when working with joins that your join conditions can include Boolean logic. We’re going to rewrite our join so it matches a record from aliased table R, if either of the following conditions are met:

  • The employee’s ancestor OrganizationNode matches another OrganizationNode. This is the existing join condition.
  • The employee’s OrganizationNode is the Root and OrganizationNode matches another. This is the CEO, as the root node is the topmost node, just like the CEO is the top person in an organization.

Here is the modified query, note the OR operator within the second join condition. Also, note how I enclose this second join condition, the one to handle the CEO, in parenthesis.

SQL
SELECT E.JobTitle, 
        P.FirstName + ' ' + P.LastName as FullName,
        RP.FirstName + ' ' + RP.LastName as ReportsTo
 FROM   HumanResources.Employee E
        INNER JOIN Person.Person P 
            ON P.BusinessEntityID = E.BusinessEntityID
        INNER JOIN HumanResources.Employee R 
            ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
          (E.OrganizationNode = hierarchyid::GetRoot() AND
           R.OrganizationNode = E.OrganizationNode)
        INNER JOIN Person.Person RP
            ON RP.BusinessEntityID = R.BusinessEntityID
 ORDER BY E.OrganizationNode

Address Org Size using Hierarchyid Functions

The last step to complete is adding the org size to our results. At first, you may think you’ll need to write some sort of recursive CTE to traverse the tree and count up reports, but once you think about it, you’ll realize you can simply test whether an employee’s OrganizationNode is a descendant of the current employees.

This becomes a simple job, as you can use the IsDecendantOf function and a correlated subquery to count rows where this condition is met. The subquery is correlated, as part of its criteria includes a reference to the outer query.

Here is an example of the completed SQL server hierarchical query.

SQL
SELECT E.JobTitle, 
        P.FirstName + ' ' + P.LastName as FullName,
        RP.FirstName + ' ' + RP.LastName as ReportsTo,
        (SELECT COUNT(1)
     FROM   HumanResources.Employee
     WHERE  OrganizationNode.IsDescendantOf(E.OrganizationNode) = 1) as OrgSize
 FROM   HumanResources.Employee E
        INNER JOIN Person.Person P
        ON P.BusinessEntityID = E.BusinessEntityID
        INNER JOIN HumanResources.Employee R
        ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
          (E.OrganizationNode = hierarchyid::GetRoot() AND
       R.OrganizationNode = E.OrganizationNode)
        INNER JOIN Person.Person RP
        ON RP.BusinessEntityID = R.BusinessEntityID
 ORDER BY E.OrganizationNode

Now that you have seen how to query tables that use the hierarchyid type, what other kinds of problems can you see solving with this type’s built-in functions?

Also, please keep in mind, depending on your database size, you may be better off using a different type of design to represent your hierarchy.

If you have any comments regarding this post, or if you have any ideas on representing a tree, let us know in the comments below.

The post SQL Server Hierarchical Query using the hierarchyid type appeared first on Essential SQL.

License

This article, along with any associated source code and files, is licensed under The MIT License