Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / MySQL

Naive Approach to Recursive MySQL

5.00/5 (1 vote)
12 Nov 2015CPOL3 min read 8K  
Provides simplistic solution to a recursive MySQL table

Introduction

Solves the problem of missing recursive statement in MySQL. It is a simplistic solution as we have to hardcode the maximum number of hierarchy levels, but when they are known, this approach can serve the purpose. It retrieves the data in a hierarchical order so the tree can be displayed easily recursively from the dataset.

Background

Other solutions don't fit the model or are too complex for a simple pre-known dept hierarchy. If in future, recursive SQL statement is available in MySQL, we can easily change the query to produce the same output.

The approach is a UNION ALL of all the different levels (depths) in the hierarchy and then sorting to arrange the data for an easy recursion afterwards when displaying.

Using the Code

Example MySQL tblhierarchy table:

id parent_id name
1 0 Level1 1
2 0 Level1 2
11 1 Level2 11
12 1 Level2 12
21 2 Level2 21
13 1 Level2 13
22 2 Level2 22
111 11 Level3 111
112 11 Level3 112
211 21 Level3 211
2111 211 Level4 2111
2112 211 Level4 2112

Example MySQL query for hierarchy with up to 4 levels to select the data ready for recursion:

SQL
select id,parent_id,name,level
 from
 (
     SELECT l1.* , 1 level, l1.id l1_sort , 0 l2_sort, 0 l3_sort,0 l4_sort
     FROM tblhierarchy l1
     WHERE IFNULL(l1.parent_id,0) = 0
     Union all
     SELECT l2.* , 2 level, l1.id l1_sort , l2.id l2_sort, 0 l3_sort,0 l4_sort
     FROM tblhierarchy l1
     inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
     Union all
     SELECT l3.* , 3 level, l1.id l1_sort , l2.id l2_sort, l3.id l3_sort,0 l4_sort
     FROM tblhierarchy l1
     inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
     inner join tblhierarchy l3 on l3.parent_id = l2.id
     Union all
     SELECT l4.* , 4 level, l1.id l1_sort , l2.id l2_sort, l3.id l3_sort,l4.id l4_sort
     FROM tblhierarchy l1
     inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
     inner join tblhierarchy l3 on l3.parent_id = l2.id
     inner join tblhierarchy l4 on l4.parent_id = l3.id
 ) r
 order by l1_sort,l2_sort,l3_sort,l4_sort

Example output:

id parent_id CONCAT(LPAD('',(r.level-1)*6,' '), '|-----', r.name) level
1 0 |-----Level1 1 1
11 1       |-----Level2 11 2
111 11             |-----Level3 111 3
112 11             |-----Level3 112 3
12 1       |-----Level2 12 2
13 1       |-----Level2 13 2
2 0 |-----Level1 2 1
21 2       |-----Level2 21 2
211 21             |-----Level3 211 3
2111 211                   |-----Level4 2111 4
2112 211                   |-----Level4 2112 4
22 2       |-----Level2 22 2

For this output, we are using:

SQL
select id,parent_id,CONCAT(LPAD('',(level-1)*6),' '), '|-----', name),level 

instead of:

SQL
select id,parent_id,name,level

to have better visual output demonstrating the accuracy of the result.

If we want to do up to 5th level, the query would look like this:

SQL
select id,parent_id,name,level
from
(
    SELECT l1.* , 1 level, l1.id l1_sort , 0 l2_sort, 0 l3_sort,0 l4_sort,0 l5_sort
    FROM tblhierarchy l1
    WHERE IFNULL(l1.parent_id,0) = 0
    Union all
    SELECT l2.* , 2 level, l1.id l1_sort , l2.id l2_sort, 0 l3_sort,0 l4_sort,0 l5_sort
    FROM tblhierarchy l1
    inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
    Union all
    SELECT l3.* , 3 level, l1.id l1_sort , l2.id l2_sort, l3.id l3_sort,0 l4_sort,0 l5_sort
    FROM tblhierarchy l1
    inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
    inner join tblhierarchy l3 on l3.parent_id = l2.id
    Union all
    SELECT l4.* , 4 level, l1.id l1_sort , l2.id l2_sort, l3.id l3_sort,l4.id l4_sort,0 l5_sort
    FROM tblhierarchy l1
    inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
    inner join tblhierarchy l3 on l3.parent_id = l2.id
    inner join tblhierarchy l4 on l4.parent_id = l3.id
    Union all
    SELECT l5.*, 5 level, l1.id l1_sort, l2.id l2_sort,l3.id l3_sort,l4.id l4_sort,l5.id l5_sort
    FROM tblhierarchy l1
    inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
    inner join tblhierarchy l3 on l3.parent_id = l2.id
    inner join tblhierarchy l4 on l4.parent_id = l3.id
    inner join tblhierarchy l5 on l5.parent_id = l4.id
) r
order by l1_sort,l2_sort,l3_sort,l4_sort,l5_sort

It is not the most elegant and efficient approach at all but it is easy to understand and for simple hierarchies should fit the purpose.

If we want to sort the levels by some other condition than the level id, we have to use a slightly different approach. We have to rank the levels by the sort criteria we want to apply. As there is no natural ranking in MySQL, an alternative approach is used. We are linking every hierarchy row with its own levels ranks and then sort by that rank:

SQL
    select r.id,r.parent_id,r.name,r.level
    from
    (
        SELECT l1.* , 1 level, l1.id l1id , 0 l2id, 0 l3id,0 l4id
        FROM tblhierarchy l1
        WHERE IFNULL(l1.parent_id,0) = 0
        Union all
        SELECT l2.* , 2 level, l1.id l1id , l2.id l2id, 0 l3id,0 l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
        Union all
        SELECT l3.* , 3 level, l1.id l1id , l2.id l2id, l3.id l3id,0 l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
        inner join tblhierarchy l3 on l3.parent_id = l2.id
        Union all
        SELECT l4.* , 4 level, l1.id l1id , l2.id l2id, l3.id l3id, l4.id l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and IFNULL(l1.parent_id,0) = 0
        inner join tblhierarchy l3 on l3.parent_id = l2.id
        inner join tblhierarchy l4 on l4.parent_id = l3.id
    ) r
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l1 on l1.id = l1id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l2 on l2.id = l2id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l3 on l3.id = l3id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l4 on l4.id = l4id

    order by l1.rank,l2.rank,l3.rank,l4.rank

The above example shows how to sort by name desc in every hierarchy level and the result is:

id parent_id CONCAT(LPAD('',(r.level-1)*6,' '), '|-----', r.name) level
2 0 |-----Level1 2 1
22 2       |-----Level2 22 2
21 2       |-----Level2 21 2
211 21             |-----Level3 211 3
2112 211                   |-----Level4 2112 4
2111 211                   |-----Level4 2111 4
1 0 |-----Level1 1 1
13 1       |-----Level2 13 2
12 1       |-----Level2 12 2
11 1       |-----Level2 11 2
112 11             |-----Level3 112 3
111 11             |-----Level3 111 3

 

If we want to select or output only a subhierarchy we specify the value of the node we want to start from.

Instead of the condition that selects the whole hierarchy

SQL
IFNULL(l1.parent_id,0) = 0

we substitute with 

SQL
l1.id = 21

Ofcouse the value 21 can be passed as a parameter.

 

SQL
    select r.id,r.parent_id,r.name,r.level
    from
    (
        SELECT l1.* , 1 level, l1.id l1id , 0 l2id, 0 l3id,0 l4id
        FROM tblhierarchy l1
        WHERE l1.id = 21
        Union all
        SELECT l2.* , 2 level, l1.id l1id , l2.id l2id, 0 l3id,0 l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and l1.id = 21
        Union all
        SELECT l3.* , 3 level, l1.id l1id , l2.id l2id, l3.id l3id,0 l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and l1.id = 21
        inner join tblhierarchy l3 on l3.parent_id = l2.id
        Union all
        SELECT l4.* , 4 level, l1.id l1id , l2.id l2id, l3.id l3id, l4.id l4id
        FROM tblhierarchy l1
        inner join tblhierarchy l2 on l2.parent_id = l1.id and l1.id = 21
        inner join tblhierarchy l3 on l3.parent_id = l2.id
        inner join tblhierarchy l4 on l4.parent_id = l3.id
    ) r
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l1 on l1.id = l1id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l2 on l2.id = l2id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l3 on l3.id = l3id
    left join(
            select tblhierarchy.*,@rank := @rank + 1 rank
            from tblhierarchy ,(select @rank := 0) rnk
    order by name desc
) l4 on l4.id = l4id

    order by l1.rank,l2.rank,l3.rank,l4.rank

 

id parent_id CONCAT(LPAD('',(r.level-1)*6,' '), '|-----', r.name) level
21 2 |-----Level2 21 1
211 21       |-----Level3 211 2
2112 211             |-----Level4 2112 3
2111 211             |-----Level4 2111 3

 

If we want to select the path from the root of the hierarchy to a specific node we can use the query 

SQL
select id,parent_id,name,@rank := @rank + 1 level
from (
	SELECT l1.id,l1.parent_id,l1.name,
	        (
		        case when l1.id is NULL then 0 else 1 end +
		        case when l2.id is NULL then 0 else 1 end +
		        case when l3.id is NULL then 0 else 1 end +
		        case when l4.id is NULL then 0 else 1 end
                )  rank
        FROM  tblhierarchy l1
        left join tblhierarchy l2 on l2.parent_id = l1.id and l2.parent_id <> 211
        left join tblhierarchy l3 on l3.parent_id = l2.id and l3.parent_id <> 211
        left join tblhierarchy l4 on l4.parent_id = l3.id and l4.parent_id <> 211
        where l1.id = 211 || l2.id = 211 || l3.id = 211 || l4.id = 211
        order by rank desc
)d , (select @rank := 0) rnk

Again we can substitute the value 211 with parameter

id parent_id CONCAT(LPAD('',(r.level-1)*6,' '), '|-----', r.name) level
2 0 |-----Level1 2 1
21 2       |-----Level2 21 2
211 21             |-----Level3 211 3

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)