Introduction
Tree structures are very useful in implementing hierarchical structures which are helpful for software developers to develop applications which are more realistic and tangible to the customers who will use them. For example, in implementing "departments structures" or "products tree" or "organization charts" with unknown level of depth tree nodes, it's inevitable to use these structures in database.
In this article, we will review stored procedures and functions needed to maintain a tree structure in a database. In this case, we have used Microsoft SQL Server 2008 as our DBMS. We will see how to List, Add, Edit, Delete, Move and Copy nodes in the tree with SQL and how to maintain our tree by removing orphan nodes. These procedures will help programmers to easily maintain tree structures in their Windows or web applications. After having these means, it's up to the programmer to provide a user friendly interface to work with tree structures.
Background
A tree structure consists of multiple nodes that every node has a parent node in such a way that we do not have any ring in our node relationships. We should have at least an Identification
field for each node and a pointer to its parent node. It's obvious that root nodes don't have any parent. We could add extra information to each node's data structure regarding the problem we are handling.
Using the Code
Creation
First of all, we should create a Tree
table in our database which is named TreeNodes
. We have an identification field and a title and a pointer to the parent identification.
CREATE TABLE [dbo].[TreeNodes](
[TN_ID] [int] IDENTITY(1,1) NOT NULL,
[TN_Title] [nvarchar](200) NOT NULL,
[TN_ParentID] [int] NULL,
CONSTRAINT [PK_TreeNodes] PRIMARY KEY CLUSTERED
(
[TN_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, _
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[TreeNodes] WITH CHECK ADD CONSTRAINT _
[FK_TreeNodes_TreeNodes] FOREIGN KEY([TN_ParentID])
REFERENCES [dbo].[TreeNodes] ([TN_ID])
GO
ALTER TABLE [dbo].[TreeNodes] CHECK CONSTRAINT [FK_TreeNodes_TreeNodes]
GO
Insertion
Now we should implement our code to easily Add Edit Remove and Move nodes in the tree.
For insertion operation, we have only two conditions:
- Nodes that haven't any parent
- Nodes that have a parent
So this is our insertion stored procedure:
CREATE PROCEDURE [dbo].[sp_insertTreeNode]
@parentid int,
@nt nvarchar(200),
@newid int output
AS
insert into TreeNodes (TN_Title, TN_ParentID)
values(@nt, @parentid)
set @newid = SCOPE_IDENTITY()
We return @newid
which is filled by newly inserted identification value, to the caller for further operations on it.
Update
After that, we have Update
procedure to update the information in a node:
CREATE PROCEDURE [dbo].[sp_updateTreeNode]
@tn_id int,
@newTitle nvarchar(200)
AS
update TreeNodes set tn_title = @newTitle
where tn_id = @tn_id
Deletion
Then, we should be able to delete a node. We should take care of deleting its children firstly otherwise we have orphans in our database. We will delete child nodes recursively.
In recursive calling of stored procedures and user defined functions, there are two obstacles you may encounter. Firstly, you may get maximum nested calls error and secondly you may get global cursor name conflict error if you have used them. I couldn't find a solution to overcome 32 level limit of nested procedure calls in SQL Server 2008 and for solving the latter problem, you may consider using this configuration command:
alter database DBName set CURSOR_DEFAULT LOCAL
So here is our delete procedure:
CREATE PROCEDURE [dbo].[sp_deleteTreeNode]
@tn_id int
AS
if (@@NESTLEVEL=1)
alter database DBName set CURSOR_DEFAULT LOCAL
declare @child_count int
select @child_count= count(*) from treenodes where tn_parentid=@tn_id
if @child_count>0
begin
declare @child_id int
DECLARE child_nodes CURSOR FOR
SELECT tn_id FROM treenodes
WHERE tn_parentid = @tn_id
OPEN child_nodes
FETCH NEXT FROM child_nodes INTO @child_id
WHILE @@FETCH_STATUS = 0
BEGIN
exec sp_deleteTreeNode @child_id
FETCH NEXT FROM child_nodes INTO @child_id
END
CLOSE child_nodes
DEALLOCATE child_nodes
end
delete from treenodes where tn_id= @tn_id
It first deletes child nodes recursively and then removes the requested node from the tree.
Note that setting CURSOR_DEFAULT
to LOCAL
prevents errors in recursive stored procedures that have a static
cursor name.
You may also consider using cascaded deletes if your DBMS supports and also using triggers on deletion to remove child nodes as an alternative method to direct removing of child nodes.
Move
For moving a node:
CREATE PROCEDURE [dbo].[sp_moveTreeNode]
@tn_id int,
@newParent int
AS
update TreeNodes set TN_ParentID = @newParent
where tn_id = @tn_id
List
For listing nodes in applications, we should list the childs of a node and then go one level more deep. Retrieving nodes recursively and building the tree in each depth is a very clear method to implement.
CREATE PROCEDURE [dbo].[sp_listNodes]
@tn_id as int
AS
select * from TreeNodes
where TN_ParentID= @tn_id
order by TN_ParentID
For getting the list of root nodes, we should call it with @tn_id
parameter set to null
.
Subtree List
Sometimes it's useful to have the list of whole children of a node:
CREATE FUNCTION [dbo].[fn_listSubTreeNodes]
(
@tn_id int
)
RETURNS
@res TABLE
(
cn_id int
)
AS
BEGIN
if @tn_id<1
RETURN
insert into @res values(@tn_id);
declare @child_count int
select @child_count= count(*) from treenodes where tn_parentid=@tn_id
if @child_count>0
begin
declare @child_id int
DECLARE child_nodes CURSOR FOR
SELECT tn_id FROM treenodes
WHERE tn_parentid = @tn_id
OPEN child_nodes
FETCH NEXT FROM child_nodes INTO @child_id
WHILE @@FETCH_STATUS = 0
BEGIN
insert @res
select * from DBName.dbo.fn_listSubTreeNodes(@child_id)
FETCH NEXT FROM child_nodes INTO @child_id
END
CLOSE child_nodes
DEALLOCATE child_nodes
end
RETURN
END
In this case, we have used table valued functions to accumulate results in a recursive manner. Consider that the observing node is included in the results. You may use @@NESTLEVEL
variable to avoid adding the root node to the result set when it is one.
Thanks to Mika Wendelius who noted me on using Common Table Expressions, here is an alternative implementation for listing subtree nodes:
CREATE PROCEDURE [dbo].[sp_listSubTreeNodes]
@tn_id as int
AS
BEGIN
with subtree(tn_id , tn_title , tn_parentid)
as
(
select * from TreeNodes where TN_ID = @tn_id
union all
select e.TN_ID, e.TN_Title ,e.TN_ParentID from TreeNodes _
as e inner join subtree on subtree.TN_ID=e.TN_ParentID
)
select * from subtree
END
Copy
Some users may want to duplicate some parts of our tree in another part of it so we should have copyTree
procedure:
CREATE PROCEDURE [dbo].[sp_copyTree]
@src_tn_id as int,
@tgt_tn_id as int
AS
declare @newTarget as int
if (@@NESTLEVEL=1)
alter database DBName set CURSOR_DEFAULT LOCAL
insert into TreeNodes(TN_ParentID, TN_Title) values _
(@tgt_tn_id, (select tn_title from TreeNodes where TN_ID=@src_tn_id))
set @newTarget = SCOPE_IDENTITY()
declare @child_count int
select @child_count=count(*) from treenodes where tn_parentid=@src_tn_id
if @child_count>0
begin
declare @child_id int
DECLARE child_nodes CURSOR FOR
SELECT tn_id FROM treenodes
WHERE tn_parentid = @src_tn_id
OPEN child_nodes
FETCH NEXT FROM child_nodes INTO @child_id
WHILE @@FETCH_STATUS = 0
BEGIN
exec sp_copyTree @child_id , @newTarget
FETCH NEXT FROM child_nodes INTO @child_id
END
CLOSE child_nodes
DEALLOCATE child_nodes
end
It starts from the source node and copies all of its children to their corresponding new parents.
Orphanage!
The nodes that their parents do not exist in database are orphans. You may not have orphans by using constraints in your database but if in some cases you felt that there may be some in your database as a result of application failures or lack of constraints, then you should decide on deleting orphan nodes cruelly or making them adopted by a new parent ! If you are the cruel person, here is the solution:
CREATE PROCEDURE [dbo].[sp_removeOrphans]
AS
declare @orphanCount int
select @orphanCount = COUNT(*) from TreeNodes where _
(TN_ParentID not in (select TN_ID from TreeNodes)) and (TN_ParentID is not null)
while (@orphanCount >0)
begin
delete TreeNodes where (TN_ParentID not in _
(select TN_ID from TreeNodes)) and (TN_ParentID is not null)
select @orphanCount = COUNT(*) from TreeNodes where _
(TN_ParentID not in (select TN_ID from TreeNodes)) _
and (TN_ParentID is not null)
end
It just deletes as many orphans as it sees!
But if some day, you wanted to restore them you should just set their root node parent to null
or new existing node.
Finally
It needs a user friendly interface to list, add, delete, edit, move or copy nodes in a database tree.
Remember the constraint of this tree is nested procedure calls level which is 32 as a default for SQL Server 2008 and global cursor names which was solved by this command:
alter database DBName set CURSOR_DEFAULT LOCAL
And consider that this model is not the most optimum designed model for tree structures but it could inspire some people to use them in their databases.
For other methods of implementing trees, consider the Member 3377871 comment at the end of this page.
Thank you very much.
Points of Interest
I got into trouble when SQL server showed me some errors in executing recursive procedures, but I managed to Google the problem and finally found the solution to overcome conflicts between global cursor names.
Also, it makes every programmer hesitant that nested procedure calls limit is 32 in SQL server 2008 and subsequently makes our tree depth limited to 32 levels.
History
- 25th February, 2011: Initial post