Introduction
Representing organizational structures in an SQL Table can be tricky. One way to do this is use a self referencing key in a table that identifies the parent-child relationship.
Can be referenced as:
Self Referencing Data Structure
Id | RefId | Name |
1 | null | Root |
2 | 1 | Customer1 |
3 | 1 | Customer2 |
4 | 1 | Customer3 |
5 | 2 | Dept 1.1 |
6 | 2 | Dept 1.2 |
7 | 2 | Dept 1.3 |
8 | 3 | Dept 2.1 |
9 | 4 | Dept 3.1 |
10 | 4 | Dept 3.2 |
As you can see, the RefId
points to the id of each parent. Through this data structure, we can identify all of the descendants for the parent. For example, the children of id 4 would be ids 9 and 10.
This article will also introduce a table valued function that will provide a list of descendant ids for a given id.
Background
To set this up, a single table must be created and populated.
CREATE TABLE [dbo].[Organizations](
[id] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](500) NOT NULL,
[ParentId] [int] NULL,
CONSTRAINT [PK_Organizations] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Organizations_ParentId] ON [dbo].[Organizations]
(
[ParentId] ASC
)WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
set identity_insert [dbo].[Organizations] on;
insert into dbo.Organizations ([id], [ParentId], [Name]) values
(1 ,null, 'Root '),
(2 ,1,'Customer1 '),
(3 ,1,'Customer2 '),
(4 ,1,'Customer3 '),
(5 ,2,'Dept 1.1 '),
(6 ,2,'Dept 1.2 '),
(7 ,2,'Dept 1.3 '),
(8 ,3,'Dept 2.1 '),
(9 ,4,'Dept 3.1 '),
(10 ,4,'Dept 3.2 ')
set identity_insert [dbo].[Organizations] off;
As the records are added, the ParentId
with take the value of the Parent Records Id. All three customer's ParentId
s reference the root id (1). This creates the hierarchy we can use to recursively select the child records.
Selecting the Children
The query that is used to select the children is pretty complicated. It is using recursion (calling itself) to identify each of the children of the parent. This example uses a CTE (Common Table Expression) to simplify the query.
;with recur as (
select a.id, a.parentid from organizations a
where a.id=2
union all
select b.id, b.parentid from Organizations b
inner join recur c on c.id = b.parentid
)
select id from recur
This code would select each of the record ids that are descendant from the record with id 2.
id
-----------
2
5
6
7
This is the simplest form of the query and could easily be extended to present more values than just the id. However, it is trimmed down to suit the next step - creating a table valued function.
Table Valued Functions
Table valued functions are SQL functions that return a table. For example, a table value function called dbo.GetDecendants
with an input parameter of id could be expressed as:
Select * from dbo.GetDecendants(3)
A table value function is set up to return its results as a table. Therefore, it can be used to return a set of integers (ids) that can be used in a more complex select
clause.
select id, parentid, substring(name,1,12) from dbo.Organizations
where id in (select * from dbo.GetDecendants(2))
Where the results of this query would be:
id parentid
----------- ----------- ------------
2 1 Customer1
5 2 Dept 1.1
6 2 Dept 1.2
7 2 Dept 1.3
(4 rows affected)
Here is the code to create the GetDecendants
function:
create FUNCTION GetDecendants
(
@id int
)
RETURNS TABLE
AS
RETURN
(
with recur as (
select a.id, a.parentid from organizations a
where a.id=@id
union all
select b.id, b.parentid from Organizations b
inner join recur c on c.id = b.parentid
)
select id from recur
)
GO
Points of Interest
This data structure is useful for cascading configuration values where the base configuration is set at the parent and each level of descendant fine tunes the configuration for the associated data object. In other words, the configuration values for the descendant would override the values set for the parent.
This could also be used as an index table that identifies the parents and children of an object and detail tables are joined to this id.
At the company, I was introduced to this data structure, it was used to identify hot spots in a hospital's performance. For example, if the hierarchy represents a hospital, floor, wing, room and the metrics were applied to the room, this pattern could be used to aggregate metrics for each section of the hospital.
History
- 7th January, 2020: Initial draft