Introduction
The Tree Relationship Calculator is an object that accepts an XML representation of a tree and will calculate the relationship of any two members within it. This article describes how relationships are calculated, and what terms like second cousin, or first cousin once removed, mean. This code includes an object for calculating relationships, written in JavaScript, as well as a web UI for rendering and interacting with the tree. The example project is setup as a classic ASP page.
The Tree as a Data Structure
Trees are non-linear data structures that have many applications in Computer Science. Non-linear structures represent a complex linking between components. Specifically, a tree is a type of graph where each node has at most one parent and zero or more child nodes.
The topmost node of a tree is called the root node, and each node in the tree is the root node of a sub-tree. The parent node is the one that is linked directly above the current node. Much of the terminology used in describing trees comes from the names we use to describe family relationships. For example, two nodes with the same parent are called siblings. We also use the terms ancestor, descendant, and children to describe the relative positions of nodes. For your reference, I'm including a few links that explain some of the concepts and terminology discussed: tree, data structure, recursion.
The depth of a node is another important concept that we will use when calculating relationships. If you start at a given node and move upwards to its parent, we would call that one step. Moving up to the parent of the parent would be two steps. If you keep moving up the tree to the root node, counting the steps, you would have the depth of the node or the distance from the node to the root.
Measuring Relationships
While terms like ancestor and descendant offer vague references to the relative position of two nodes in a tree, genealogists use much more precise terminology to describe relationships, which allows them to explain the exact relative position of two members. The term "third cousin once removed" is one example, and this is the type of relationship our program will calculate.
There are two components that measure the type of relationship of two people in a family tree, and these measure the distance horizontally and vertically from each family member. The first component describes the horizontal distance between two family members, and is measured by how far apart their closest common relative is. For example, your first cousin shares a common grandparent who is two levels up. A second cousin is someone with whom your closest common ancestor is your great grand parent, three levels up.
To figure out the level of cousin, you take the number of generations away your closest common relative is and subtract it by one. A common great-grand parent (3) means you are (3-1= 2) second cousins, a common grand parent means you are first cousins. A common parent means you are zero-cousins, which we call a sibling. The diagram below shows that Cyclops and Hemera are first cousins because they share a common grandparent.
The second component measures the vertical distance in the tree, and describes how far separated you are by generation. This is the “once removed” part of the phrase “first cousin once removed”. For instance, your mother is one generation older than you, so she is once removed. Your grandmother is two generations older, so she is twice removed. You are a direct descendent of both your grandmother and mother and don't use the “removed” terminology; however, the principal is the same for more distant relatives. Your second cousin twice removed would be of the same generation as your grandmother.
The diagram below shows that Hemera and Leto share Chaos as their closest common ancestor. Chaos is two generations away from Hemera and four from Leto. We use the closer of the two to determine the cousin factor, and the difference between the two to calculate the removed factor. This tells us that Hemera and Leto are first cousins twice removed.
- Distance from closest common ancestor: Hemera:2, Leto:4
- Hemera(2) – 1 = 1st Cousin
- Leto(4) – Hemera(2) = 2 times removed
Named Relationships
Using these two components, you can calculate the relationship to any member in your tree. Most of our immediate family have special names for their relationship, but follow the same rules as more distant second, third, and fourth cousins. For instance, your zero cousin once removed would be your parent. Your zero cousin zero times removed is your sibling. The following table shows how named relationships match up to the cousin/removed convention.
C = lesser distance between the closest common ancestor
R = distance from further relative minus the distance from closer relative
XML Structure
Our program for calculating relationships will take as input an XML document representing the family tree, which will be represented by a root <familytree>
node which holds one <person>
node, which in turn contains a <children>
node that can contain zero-many <person>
nodes. Here is the entire XML file representing our example family tree. Our example is a partial family tree of the Greek gods I created from information found on the Theoi Greek Mythology website, which is worth taking a look at.
The Algorithm
Here is the algorithm we will follow for calculating an arbitrary relationship:
- Find the position in the tree of both members
- Use the positions to find the distance of each to their closest common ancestor
- Use the distances to determine the cousin (c) and removed (r) factors
- Use the c and r variables to generate a text name for the relationship
Recursive Search Function
Recursion and self-similarity are two other important concepts in Computer Science, and they are particularly useful when dealing with trees. To find the position of each member, our program will call a recursive search function to navigate each node of the tree and record its current position as it goes. When a match is found, the position is saved. The position is stored as an array, with one element of the array for each generation. The integer stored represents the order that child was in, starting at zero. For instance, the position array for Hemera below would be {0, 2, 1}, tracking the order of each child down the tree: Chaos(0), Nyx(2), Hemera(1).
Here is the recursive search function for our program. This function accepts in an XML node, and checks to see if it matches either ID we are looking for, then calls itself on each child of the node.
function testNode(PerNode, depth,idFirst, idLast){
if (PerNode.getAttribute("id") == null)
return false;
var id = parseInt(PerNode.getAttribute("id"));
depth++;
if (id==idFirst) this.arFirst = this.arPos.copy();
if (id==idLast){ this.arLast = this.arPos.copy();}
if (this.arFirst !=null && this.arLast!=null)return true;
var ChildrenNodes = PerNode.getElementsByTagName("children");
if (!ChildrenNodes) return false;
for (var c=0,k=0; c<ChildrenNode.childNodes.length; c++){
k++;
this.arPos.length = depth;
this.arPos[depth] = k;
var ret = this.TestNode(ChildrenNode.childNodes.item(c),
depth, idFirst, idLast)
if (ret == true) return true;
}
return false;
}
Step 1: Initiate the Search
The getRelById()
function is the method that initiates the recursive search. Once the search is complete, we will have two arrays that describe the position of both members in the tree. The getRelByArray()
function is then called to calculate the relationship based on the position arrays.
function getRelById(idFirst, idLast)
{
if (isNaN(idFirst) || isNaN(idLast))
throw new Error("Id is not numeric");
idFirst = parseInt(idFirst);
idLast = parseInt(idLast);
if (!this.xmlDoc) return false;
this.arPos[0]=0;
var RootNode = this.xmlDoc.documentElement.firstChild;
if (!this.TestNode(RootNode,0,idFirst, idLast)) return false;
return this.GetRelByArray(this.arFirst, this.arLast, this.g);
}
Step 2: Convert Positions into Distance from Closest Ancestor
The getRelByArray()
function compares the two arrays, and determines the distance from the closest common ancestor for each member. This data is then sent to getRelByDistance()
to calculate the relationship further.
function getRelByArray(arF, arL, g)
{
g = (g==0||g==1)? g : 2;
var len = (arF.length<arL.length)? arF.length : arL.length;
for (var i=0; i<len; i++){if (arF[i] != arL[i]) break;}
var a = (arF.length - i);
var b = (arL.length - i);
return(this.GetRelByDistance(a,b,g));
}
Step 3: Convert Ancestor Distance into R and C Factors
The getRelByDistance()
function will take in the distance of both members to their closest common ancestor, and calculate the removed (r
) and cousin (c
) factors for the relationship. This is sent to getRelByRC()
, which generates a text name for the relationship.
function getRelByDistance(a,b,g)
{
a = parseInt(a);
b = parseInt(b);
g = (g==0||g==1)? g : 2;
var c = ((b>=a)? a: a - (a - b))
var r = a - b;
return(this.GetRelByRC(c,r,g));
}
Step 4: Generating a Text Name Using R and C
Finally, the getRelByRC()
function will create a text name of the relationship. For most relationships, we simply need to subtract 1 from the c
factor to get something like c3:r1 equals “second cousin once removed”. For the named relationships (zero cousins, etc.). we do a simple lookup to match the c
and r
variables to the predefined names. The gender variable g
is passed so we know when to use aunt instead of uncle and mother instead of father.
function getRelByRC(c,r,g)
{
var strName = "";
var absR;
var g = (g==0||g==1)? g : 2;
var arYou = [["same person","same person","same person"],
["brother","sister","sibling"]];
var arStatic = new Array();
arStatic[0] = [["son","daughter","child"],
["father","mother","parent"]];
arStatic[1] = [["nephew","neice","nephew/neice"],
["uncle","aunt","uncle/aunt"]];
if (c==0) {
absR = Math.abs(r);
if (absR==0) strName = arYou[c][g];
if (absR>0) strName = (r<0)? arStatic[c][0][g] : arStatic[c][1][g];
if (absR>1) strName = " grand " + strName;
if (absR>2) strName = " great " + strName;
if (absR>3) strName = rcFormatPrefix(absR-2) + strName;
}
else if(c==1) {
absR = Math.abs(r);
if (absR==0) strName = arYou[c][g];
if (absR>0) strName = (r<0)? arStatic[c][0][g] : arStatic[c][1][g];
if (absR==2) strName = " great "; + strName;
else{
if (absR>1) strName = "grand" + strName;
if (absR>2) strName = " great " + strName;
if (absR>3) strName = rcFormatPrefix(absR-1) + strName;
}
}
else{
var cousin = rcFormatPrefix(c-1) + " cousin ";
var removed = (Math.abs(r)>0)? Math.abs(r) + " times removed." : "";
strName = cousin + removed;
}
return strName;
}
Using the Code
The functions described above make up the methods of our object. The entire object can be found in the file Rel.txt [4.98 KB]. To calculate a relationship, we simply instantiate the object, passing a reference to our XML tree, then call the GetRelById()
function. The result is a string representation of the relationship.
var relObj = new RelCalculator(xmlDoc);
var strRel = relObj.GetRelById(100, 143);
Our code also includes a UI which renders the XML as a visual tree using XHTML and CSS. The UI allows us to visualize the tree and make calls to the Relationship Calculator through a drag-and-drop interface. I won't go into the specifics of how this page is rendered, but here are the files if you are interested. You can also try out the working example.
History
- 21st October, 2008: Initial post.