Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Implementing a Tree Structure with Database

4.82/5 (10 votes)
5 Mar 2011CPOL5 min read 81.3K  
This article describes implementing and manipulating a tree structure by means of SQL

Introduction

Tree structures are very useful in implementing hierarchical structures which are helpful for software developers to develop applications which are more realistic and tangible to the customers who will use them. For example, in implementing "departments structures" or "products tree" or "organization charts" with unknown level of depth tree nodes, it's inevitable to use these structures in database.

In this article, we will review stored procedures and functions needed to maintain a tree structure in a database. In this case, we have used Microsoft SQL Server 2008 as our DBMS. We will see how to List, Add, Edit, Delete, Move and Copy nodes in the tree with SQL and how to maintain our tree by removing orphan nodes. These procedures will help programmers to easily maintain tree structures in their Windows or web applications. After having these means, it's up to the programmer to provide a user friendly interface to work with tree structures.

Background

A tree structure consists of multiple nodes that every node has a parent node in such a way that we do not have any ring in our node relationships. We should have at least an Identification field for each node and a pointer to its parent node. It's obvious that root nodes don't have any parent. We could add extra information to each node's data structure regarding the problem we are handling.

Using the Code

Creation

First of all, we should create a Tree table in our database which is named TreeNodes. We have an identification field and a title and a pointer to the parent identification.

SQL
CREATE TABLE [dbo].[TreeNodes](
	[TN_ID] [int] IDENTITY(1,1) NOT NULL,
	[TN_Title] [nvarchar](200) NOT NULL,
	[TN_ParentID] [int] NULL,
 CONSTRAINT [PK_TreeNodes] PRIMARY KEY CLUSTERED
(
	[TN_ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, _
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[TreeNodes]  WITH CHECK ADD  CONSTRAINT _
[FK_TreeNodes_TreeNodes] FOREIGN KEY([TN_ParentID])
REFERENCES [dbo].[TreeNodes] ([TN_ID])
GO

ALTER TABLE [dbo].[TreeNodes] CHECK CONSTRAINT [FK_TreeNodes_TreeNodes]
GO

Insertion

Now we should implement our code to easily Add Edit Remove and Move nodes in the tree.

For insertion operation, we have only two conditions:

  1. Nodes that haven't any parent
  2. Nodes that have a parent

So this is our insertion stored procedure:

SQL
CREATE PROCEDURE [dbo].[sp_insertTreeNode]
@parentid int,
@nt nvarchar(200),
@newid int output
AS
insert into TreeNodes (TN_Title, TN_ParentID)
   values(@nt, @parentid)
set @newid = SCOPE_IDENTITY()

We return @newid which is filled by newly inserted identification value, to the caller for further operations on it.

Update

After that, we have Update procedure to update the information in a node:

SQL
CREATE PROCEDURE [dbo].[sp_updateTreeNode]
@tn_id int,
@newTitle nvarchar(200)
AS
update TreeNodes set tn_title = @newTitle
where tn_id = @tn_id      

Deletion

Then, we should be able to delete a node. We should take care of deleting its children firstly otherwise we have orphans in our database. We will delete child nodes recursively.

In recursive calling of stored procedures and user defined functions, there are two obstacles you may encounter. Firstly, you may get maximum nested calls error and secondly you may get global cursor name conflict error if you have used them. I couldn't find a solution to overcome 32 level limit of nested procedure calls in SQL Server 2008 and for solving the latter problem, you may consider using this configuration command:

SQL
alter database DBName set CURSOR_DEFAULT  LOCAL

So here is our delete procedure:

SQL
CREATE PROCEDURE [dbo].[sp_deleteTreeNode]
@tn_id int
AS
if (@@NESTLEVEL=1)
	alter database DBName set CURSOR_DEFAULT  LOCAL

declare @child_count int

select @child_count=  count(*) from treenodes where tn_parentid=@tn_id

if @child_count>0
begin
	--firstly delete child nodes recursively
	declare @child_id int
	DECLARE child_nodes  CURSOR FOR
		SELECT tn_id FROM treenodes
		WHERE tn_parentid = @tn_id

	OPEN child_nodes

	FETCH NEXT FROM child_nodes INTO @child_id

	WHILE @@FETCH_STATUS = 0
	BEGIN
	   exec sp_deleteTreeNode @child_id

	   FETCH NEXT FROM child_nodes INTO @child_id
	END

	CLOSE child_nodes
	DEALLOCATE child_nodes
end

--delete a node without any child
delete from treenodes where tn_id= @tn_id

It first deletes child nodes recursively and then removes the requested node from the tree.

Note that setting CURSOR_DEFAULT to LOCAL prevents errors in recursive stored procedures that have a static cursor name.

You may also consider using cascaded deletes if your DBMS supports and also using triggers on deletion to remove child nodes as an alternative method to direct removing of child nodes.

Move

For moving a node:

SQL
CREATE PROCEDURE [dbo].[sp_moveTreeNode]
@tn_id int,
@newParent int
AS
update TreeNodes set TN_ParentID = @newParent
where tn_id = @tn_id

List

For listing nodes in applications, we should list the childs of a node and then go one level more deep. Retrieving nodes recursively and building the tree in each depth is a very clear method to implement.

SQL
CREATE PROCEDURE [dbo].[sp_listNodes]
@tn_id as int
AS
select * from TreeNodes
where TN_ParentID= @tn_id
order by TN_ParentID

For getting the list of root nodes, we should call it with @tn_id parameter set to null.

Subtree List

Sometimes it's useful to have the list of whole children of a node:

SQL
CREATE FUNCTION [dbo].[fn_listSubTreeNodes]
(
@tn_id int
)
RETURNS
@res TABLE
(
	cn_id int
)
AS
BEGIN
	if @tn_id<1
		RETURN

	insert into @res values(@tn_id);

	declare @child_count int
	select @child_count=  count(*) from treenodes where tn_parentid=@tn_id
	if @child_count>0
	begin
		declare @child_id int
		DECLARE child_nodes  CURSOR FOR
			SELECT tn_id FROM treenodes
			WHERE tn_parentid = @tn_id

		OPEN child_nodes

		FETCH NEXT FROM child_nodes INTO @child_id

		WHILE @@FETCH_STATUS = 0
		BEGIN

			insert @res
			select * from DBName.dbo.fn_listSubTreeNodes(@child_id)

		   FETCH NEXT FROM child_nodes INTO @child_id
		END

		CLOSE child_nodes
		DEALLOCATE child_nodes
	end

	RETURN
END

In this case, we have used table valued functions to accumulate results in a recursive manner. Consider that the observing node is included in the results. You may use @@NESTLEVEL variable to avoid adding the root node to the result set when it is one.

Thanks to Mika Wendelius who noted me on using Common Table Expressions, here is an alternative implementation for listing subtree nodes:

SQL
CREATE PROCEDURE [dbo].[sp_listSubTreeNodes]
@tn_id as int
AS
BEGIN
with subtree(tn_id , tn_title , tn_parentid)
as
(
select * from TreeNodes where TN_ID = @tn_id
union all
select e.TN_ID, e.TN_Title ,e.TN_ParentID from TreeNodes _
as e inner join subtree on subtree.TN_ID=e.TN_ParentID
)
select * from subtree
END

Copy

Some users may want to duplicate some parts of our tree in another part of it so we should have copyTree procedure:

SQL
CREATE PROCEDURE [dbo].[sp_copyTree]
@src_tn_id as int,
@tgt_tn_id as int
AS
declare @newTarget as int

if (@@NESTLEVEL=1)
	alter database DBName set CURSOR_DEFAULT  LOCAL

insert into TreeNodes(TN_ParentID, TN_Title) values _
(@tgt_tn_id, (select tn_title from TreeNodes where TN_ID=@src_tn_id))
set @newTarget = SCOPE_IDENTITY()

declare @child_count int
select @child_count=count(*) from treenodes where tn_parentid=@src_tn_id

if @child_count>0
begin
	declare @child_id int
	DECLARE child_nodes  CURSOR FOR
		SELECT tn_id FROM treenodes
		WHERE tn_parentid = @src_tn_id

	OPEN child_nodes

	FETCH NEXT FROM child_nodes INTO @child_id

	WHILE @@FETCH_STATUS = 0
	BEGIN
	   exec sp_copyTree @child_id , @newTarget
	   FETCH NEXT FROM child_nodes INTO @child_id
	END

	CLOSE child_nodes
	DEALLOCATE child_nodes
end

It starts from the source node and copies all of its children to their corresponding new parents.

Orphanage!

The nodes that their parents do not exist in database are orphans. You may not have orphans by using constraints in your database but if in some cases you felt that there may be some in your database as a result of application failures or lack of constraints, then you should decide on deleting orphan nodes cruelly or making them adopted by a new parent ! If you are the cruel person, here is the solution:

SQL
CREATE PROCEDURE [dbo].[sp_removeOrphans]
AS
declare @orphanCount int

select @orphanCount = COUNT(*) from TreeNodes where _
(TN_ParentID not in (select TN_ID from TreeNodes))  and  (TN_ParentID is not null)
while (@orphanCount >0)
begin
	delete TreeNodes where (TN_ParentID not in _
	(select TN_ID from TreeNodes))  and  (TN_ParentID is not null)
	select @orphanCount = COUNT(*) from TreeNodes where _
	(TN_ParentID not in (select TN_ID from TreeNodes))  _
	and  (TN_ParentID is not null)
end

It just deletes as many orphans as it sees!
But if some day, you wanted to restore them you should just set their root node parent to null or new existing node.

Finally

It needs a user friendly interface to list, add, delete, edit, move or copy nodes in a database tree.

Remember the constraint of this tree is nested procedure calls level which is 32 as a default for SQL Server 2008 and global cursor names which was solved by this command:

SQL
alter database DBName set CURSOR_DEFAULT  LOCAL

And consider that this model is not the most optimum designed model for tree structures but it could inspire some people to use them in their databases.
For other methods of implementing trees, consider the Member 3377871 comment at the end of this page.

Thank you very much.

Points of Interest

I got into trouble when SQL server showed me some errors in executing recursive procedures, but I managed to Google the problem and finally found the solution to overcome conflicts between global cursor names.

Also, it makes every programmer hesitant that nested procedure calls limit is 32 in SQL server 2008 and subsequently makes our tree depth limited to 32 levels.

History

  • 25th February, 2011: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)