Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / web / XHTML

Tree Relationship Calculator

4.00/5 (7 votes)
21 Oct 2008CPOL7 min read 1   665  
The Tree Relationship Calculator is an object that accepts an XML representation of a tree and will calculate the relationship of any two members within it. This article describes how relationships are calculated, and what terms like second cousin, or first cousin once removed, mean.

Introduction

The Tree Relationship Calculator is an object that accepts an XML representation of a tree and will calculate the relationship of any two members within it. This article describes how relationships are calculated, and what terms like second cousin, or first cousin once removed, mean. This code includes an object for calculating relationships, written in JavaScript, as well as a web UI for rendering and interacting with the tree. The example project is setup as a classic ASP page.

Family Tree Relationship Calculator Screenshot

The Tree as a Data Structure

Trees are non-linear data structures that have many applications in Computer Science. Non-linear structures represent a complex linking between components. Specifically, a tree is a type of graph where each node has at most one parent and zero or more child nodes.

The topmost node of a tree is called the root node, and each node in the tree is the root node of a sub-tree. The parent node is the one that is linked directly above the current node. Much of the terminology used in describing trees comes from the names we use to describe family relationships. For example, two nodes with the same parent are called siblings. We also use the terms ancestor, descendant, and children to describe the relative positions of nodes. For your reference, I'm including a few links that explain some of the concepts and terminology discussed: tree, data structure, recursion.

The depth of a node is another important concept that we will use when calculating relationships. If you start at a given node and move upwards to its parent, we would call that one step. Moving up to the parent of the parent would be two steps. If you keep moving up the tree to the root node, counting the steps, you would have the depth of the node or the distance from the node to the root.

Measuring Relationships

While terms like ancestor and descendant offer vague references to the relative position of two nodes in a tree, genealogists use much more precise terminology to describe relationships, which allows them to explain the exact relative position of two members. The term "third cousin once removed" is one example, and this is the type of relationship our program will calculate.

There are two components that measure the type of relationship of two people in a family tree, and these measure the distance horizontally and vertically from each family member. The first component describes the horizontal distance between two family members, and is measured by how far apart their closest common relative is. For example, your first cousin shares a common grandparent who is two levels up. A second cousin is someone with whom your closest common ancestor is your great grand parent, three levels up.

To figure out the level of cousin, you take the number of generations away your closest common relative is and subtract it by one. A common great-grand parent (3) means you are (3-1= 2) second cousins, a common grand parent means you are first cousins. A common parent means you are zero-cousins, which we call a sibling. The diagram below shows that Cyclops and Hemera are first cousins because they share a common grandparent.

rc2.png

The second component measures the vertical distance in the tree, and describes how far separated you are by generation. This is the “once removed” part of the phrase “first cousin once removed”. For instance, your mother is one generation older than you, so she is once removed. Your grandmother is two generations older, so she is twice removed. You are a direct descendent of both your grandmother and mother and don't use the “removed” terminology; however, the principal is the same for more distant relatives. Your second cousin twice removed would be of the same generation as your grandmother.

The diagram below shows that Hemera and Leto share Chaos as their closest common ancestor. Chaos is two generations away from Hemera and four from Leto. We use the closer of the two to determine the cousin factor, and the difference between the two to calculate the removed factor. This tells us that Hemera and Leto are first cousins twice removed.

rc1.png

  • Distance from closest common ancestor: Hemera:2, Leto:4
    • Hemera(2) – 1 = 1st Cousin
    • Leto(4) – Hemera(2) = 2 times removed

Named Relationships

Using these two components, you can calculate the relationship to any member in your tree. Most of our immediate family have special names for their relationship, but follow the same rules as more distant second, third, and fourth cousins. For instance, your zero cousin once removed would be your parent. Your zero cousin zero times removed is your sibling. The following table shows how named relationships match up to the cousin/removed convention.

C = lesser distance between the closest common ancestor

R = distance from further relative minus the distance from closer relative

test.jpg

XML Structure

Our program for calculating relationships will take as input an XML document representing the family tree, which will be represented by a root <familytree> node which holds one <person> node, which in turn contains a <children> node that can contain zero-many <person> nodes. Here is the entire XML file representing our example family tree. Our example is a partial family tree of the Greek gods I created from information found on the Theoi Greek Mythology website, which is worth taking a look at.

xml.png

The Algorithm

Here is the algorithm we will follow for calculating an arbitrary relationship:

  1. Find the position in the tree of both members
  2. Use the positions to find the distance of each to their closest common ancestor
  3. Use the distances to determine the cousin (c) and removed (r) factors
  4. Use the c and r variables to generate a text name for the relationship

Recursive Search Function

Recursion and self-similarity are two other important concepts in Computer Science, and they are particularly useful when dealing with trees. To find the position of each member, our program will call a recursive search function to navigate each node of the tree and record its current position as it goes. When a match is found, the position is saved. The position is stored as an array, with one element of the array for each generation. The integer stored represents the order that child was in, starting at zero. For instance, the position array for Hemera below would be {0, 2, 1}, tracking the order of each child down the tree: Chaos(0), Nyx(2), Hemera(1).

rc3.png

Here is the recursive search function for our program. This function accepts in an XML node, and checks to see if it matches either ID we are looking for, then calls itself on each child of the node.

JavaScript
function testNode(PerNode, depth,idFirst, idLast){
    if (PerNode.getAttribute("id") == null) 
        return false;
    var id = parseInt(PerNode.getAttribute("id"));
    depth++;

    // Look for a match for either person
    if (id==idFirst) this.arFirst = this.arPos.copy();
    if (id==idLast){ this.arLast = this.arPos.copy();}

    // If both people have been found, end recursion
    if (this.arFirst !=null && this.arLast!=null)return true;

    // If no children end recursion
    var ChildrenNodes = PerNode.getElementsByTagName("children");
    if (!ChildrenNodes) return false;
    
    // For each child of the given node, call this same function
    for (var c=0,k=0; c<ChildrenNode.childNodes.length; c++){
        k++;
        this.arPos.length = depth;
        this.arPos[depth] = k;
        var ret = this.TestNode(ChildrenNode.childNodes.item(c), 
        depth, idFirst, idLast)
        if (ret == true) return true;
    }
    return false;
}

Step 1: Initiate the Search

The getRelById() function is the method that initiates the recursive search. Once the search is complete, we will have two arrays that describe the position of both members in the tree. The getRelByArray() function is then called to calculate the relationship based on the position arrays.

JavaScript
function getRelById(idFirst, idLast)
{
    if (isNaN(idFirst) || isNaN(idLast)) 
        throw new Error("Id is not numeric");
    idFirst = parseInt(idFirst);
    idLast = parseInt(idLast);
    if (!this.xmlDoc) return false;

    // Start the search. If both matches are not found return false
    this.arPos[0]=0;
    var RootNode = this.xmlDoc.documentElement.firstChild;
    if (!this.TestNode(RootNode,0,idFirst, idLast)) return false;
    
    // Get the named relationship based on the position arrays
    return this.GetRelByArray(this.arFirst, this.arLast, this.g);
}

Step 2: Convert Positions into Distance from Closest Ancestor

The getRelByArray() function compares the two arrays, and determines the distance from the closest common ancestor for each member. This data is then sent to getRelByDistance() to calculate the relationship further.

JavaScript
function getRelByArray(arF, arL, g)
{
    // Set the gender flag
    g = (g==0||g==1)? g : 2;
    
    // Get length of the shorter array
    var len = (arF.length<arL.length)? arF.length : arL.length;
    
    // Compare the arrays to find out where they deviate.
    // i records how far down the arrays are the same
    for (var i=0; i<len; i++){if (arF[i] != arL[i]) break;}
    
    // Subtract i from the length of each array. This re-factors
    // the length in relation to the closest common relative
    var a = (arF.length - i);
    var b = (arL.length - i);
    
    // Use a,b (the distance of each to the closest common relative)
    // to generate the relationship label
    return(this.GetRelByDistance(a,b,g));
}

Step 3: Convert Ancestor Distance into R and C Factors

The getRelByDistance() function will take in the distance of both members to their closest common ancestor, and calculate the removed (r) and cousin (c) factors for the relationship. This is sent to getRelByRC(), which generates a text name for the relationship.

JavaScript
function getRelByDistance(a,b,g)
{
    a = parseInt(a);
    b = parseInt(b);
    g = (g==0||g==1)? g : 2;
    var c = ((b>=a)?  a: a - (a - b))
    var r = a - b;
    return(this.GetRelByRC(c,r,g));
}

Step 4: Generating a Text Name Using R and C

Finally, the getRelByRC() function will create a text name of the relationship. For most relationships, we simply need to subtract 1 from the c factor to get something like c3:r1 equals “second cousin once removed”. For the named relationships (zero cousins, etc.). we do a simple lookup to match the c and r variables to the predefined names. The gender variable g is passed so we know when to use aunt instead of uncle and mother instead of father.

JavaScript
function getRelByRC(c,r,g)
{
    var strName = "";
    var absR;
    var g = (g==0||g==1)? g : 2;
    var arYou = [["same person","same person","same person"],
        ["brother","sister","sibling"]];
    var arStatic = new Array();

    arStatic[0] = [["son","daughter","child"],
        ["father","mother","parent"]];
    arStatic[1] = [["nephew","neice","nephew/neice"],
        ["uncle","aunt","uncle/aunt"]];

    // c==0 are your direct descendants 
    if (c==0) { 
        absR = Math.abs(r); 
        if (absR==0) strName = arYou[c][g]; 
        if (absR>0) strName = (r<0)? arStatic[c][0][g] : arStatic[c][1][g]; 
        if (absR>1) strName = " grand " + strName; 
        if (absR>2) strName = " great " + strName; 
        if (absR>3) strName = rcFormatPrefix(absR-2) + strName; 
    } 

    // c==1 are zero cousins 
    else if(c==1) { 
        absR = Math.abs(r); 
        if (absR==0) strName = arYou[c][g]; 
        if (absR>0)  strName = (r<0)? arStatic[c][0][g] : arStatic[c][1][g]; 
        if (absR==2) strName = " great "; + strName; 
        else{ 
        if (absR>1) strName = "grand" + strName; 
        if (absR>2) strName = " great " + strName; 
        if (absR>3) strName = rcFormatPrefix(absR-1) + strName; 
        }
    } 

    // c>1 are all other 2nd, 3rd, etc... cousins 
    else{ 
        var cousin = rcFormatPrefix(c-1) + " cousin "; 
        var removed = (Math.abs(r)>0)? Math.abs(r) + " times removed." : ""; 
        strName = cousin + removed; 
    } 

    return strName; 
}

Using the Code

The functions described above make up the methods of our object. The entire object can be found in the file Rel.txt [4.98 KB]. To calculate a relationship, we simply instantiate the object, passing a reference to our XML tree, then call the GetRelById() function. The result is a string representation of the relationship.

JavaScript
var relObj = new RelCalculator(xmlDoc);
var strRel = relObj.GetRelById(100, 143);

Our code also includes a UI which renders the XML as a visual tree using XHTML and CSS. The UI allows us to visualize the tree and make calls to the Relationship Calculator through a drag-and-drop interface. I won't go into the specifics of how this page is rendered, but here are the files if you are interested. You can also try out the working example.

calc_small.jpg

calc_small.jpg

History

  • 21st October, 2008: Initial post.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)