Introduction
In this article, I implement an English dictionary application using a ternary search tree, packaged as an MFC dialog-based application that has an input field and a list of words. It does prefix matching: as we type in the input field, the words that do not match are filtered out of the list. It also does a neighbor search for a particular word, i.e. it lists the words that closely match the word we have typed.
For example, suppose the dictionary contains the word “bat”. If we type “bat” and click the “More Words” button, it gives a list of words such as “bat”, “mad”, “mat”, “rat”, “sad” and “sat”. Of course, these words must already be present in the tree structure.
We have another output field called “Meaning”, which will show the meaning of a word typed in the input box.
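The “near words” in the example above are those that differ from the typed word in only a few character positions. A minimal sketch of that measure for equal-length words follows; HammingDistance is a hypothetical helper written for illustration and is not part of the application's source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of positions at which two equal-length strings differ.
// Hypothetical helper for illustration only.
std::size_t HammingDistance(const std::string& a, const std::string& b)
{
    assert(a.size() == b.size());  // the measure is defined for equal lengths only
    std::size_t distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (a[i] != b[i]) ++distance;
    }
    return distance;
}
```

With this helper, “bat” and “mat” are at distance 1, and “bat” and “mad” are at distance 2.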
Explanation of the Code
Two main classes implement the dictionary application: CTernarySearchTree and CTSTNode. The dialog class (CEnglishDictionaryDlg) owns the ternary search tree class, which in turn uses the node class.
Let’s delve into the dialog class. It has a few main functions: OnButton1, OnChangeEdit1, OnButton2 and OnButton3. When this project was first implemented, I kept a button (Button1) that loaded the tree with data when clicked. I have since hidden that button and instead call its handler (OnButton1) from OnInitDialog. OnButton2 displays the meaning, OnChangeEdit1 does the prefix matching as we type, and OnButton3 shows the words found by the near search.
That covers the front end of the dictionary application. The main logic lies in the classes CTSTNode and CTernarySearchTree. Let’s discuss these two classes.
CTSTNode is the class that represents each node in the tree. As the tree is a ternary search tree, each node has three subtrees, referred to as LOKID, EQKID and HIKID. A node also holds a pointer to the original string and a pointer to its meaning, both of which are loaded from a text file. Finally, CTSTNode has a character member called cSplitChar.
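class CTSTNode
{
public:
    friend class CTernarySearchTree;
    friend class CEnglishDictionaryDlg;
    CTSTNode();
    // Creates a node holding *SplitChar, with p as its EQKID subtree.
    CTSTNode(CTSTNode* p, char* SplitChar)
    {
        cSplitChar = *SplitChar;
        EQKID = p;
        LOKID = HIKID = PARENT = NULL;  // do not leave these pointers uninitialized
        originalstring = NULL;
        meaning = NULL;
    }
    virtual ~CTSTNode();
private:
    char cSplitChar;                           // character stored at this node
    CTSTNode *LOKID, *HIKID, *EQKID, *PARENT;  // subtrees and parent link
    char* originalstring;                      // the word, set where a word ends
    char* meaning;                             // its meaning, set where a word ends
};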
Fig: Class declaration of CTSTNode
While inserting data into the tree, the logic takes one character (say SplitChar) from the string being loaded and compares it with the current node’s cSplitChar. If SplitChar comes alphabetically before cSplitChar, the insertion continues in the LOKID subtree of the current node. If it comes after cSplitChar, the insertion continues in the HIKID subtree. If the two are equal, the insertion moves on to the next character of the string and continues in the EQKID subtree. The process repeats in this way until the whole string has been consumed. This logic can be seen in the Insert function of the CTernarySearchTree class, given below:
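if (*SplitChar != '\0')
{
    no_of_recursion++;
    if (nodeptr == NULL)
    {
        // Empty slot: create a node for the current character.
        nodeptr = new CTSTNode(nodeptr, SplitChar);
        nodeptr->LOKID = nodeptr->HIKID = nodeptr->EQKID = NULL;
    }
    if (*SplitChar < nodeptr->cSplitChar)
    {
        // Smaller character: same position in the word, left subtree.
        nodeptr->LOKID = Insert(nodeptr->LOKID, SplitChar, meaning);
    }
    else if (*SplitChar == nodeptr->cSplitChar)
    {
        // Match: advance to the next character, middle subtree.
        nodeptr->EQKID = Insert(nodeptr->EQKID, ++SplitChar, meaning);
    }
    else
    {
        // Larger character: same position in the word, right subtree.
        nodeptr->HIKID = Insert(nodeptr->HIKID, SplitChar, meaning);
    }
}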
Fig: Snippet from Insert function
If we study the Insert function a little more closely, we see that once the end of a word being inserted is reached, i.e. the ‘\0’ character is reached (which is when the variable lastnodeinitialized becomes TRUE), it stores two pointers inside that node: one to the word itself and one to its meaning. This can be seen in the code below:
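// End of the word reached after consuming at least one character.
if (*SplitChar == '\0' && no_of_recursion)
{
    lastnodeinitialized = TRUE;
    no_of_recursion--;
}
// Attach the word and its meaning to the node, then reset the state.
if (lastnodeinitialized && nodeptr)
{
    nodeptr->originalstring = originalstring;
    nodeptr->meaning = meaning;
    lastnodeinitialized = FALSE;
    no_of_recursion = 0;
    originalstring = NULL;
}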
Fig: Snippet from Insert function
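The two snippets above can be condensed into a self-contained sketch. It is not the application's code: TstNode, Insert and Lookup are hypothetical names, std::string replaces the char* members, and an isWord flag stands in for the lastnodeinitialized bookkeeping. It also assumes that inserted words are non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical node type for illustration (the article's class is CTSTNode).
struct TstNode
{
    char split;                           // character stored at this node
    bool isWord;                          // true if a word ends at this node
    std::string meaning;                  // filled in only when isWord is true
    std::unique_ptr<TstNode> lo, eq, hi;  // the three subtrees

    explicit TstNode(char c) : split(c), isWord(false) {}
};

// Inserts word[i..] below 'node', creating nodes as needed.
void Insert(std::unique_ptr<TstNode>& node, const std::string& word,
            const std::string& meaning, std::size_t i = 0)
{
    assert(i < word.size());  // assumption: words are non-empty
    if (!node) node.reset(new TstNode(word[i]));

    if (word[i] < node->split)
        Insert(node->lo, word, meaning, i);      // same character, left subtree
    else if (word[i] > node->split)
        Insert(node->hi, word, meaning, i);      // same character, right subtree
    else if (i + 1 < word.size())
        Insert(node->eq, word, meaning, i + 1);  // next character, middle subtree
    else
    {
        node->isWord = true;                     // last character: the word ends here
        node->meaning = meaning;
    }
}

// Returns a pointer to the stored meaning, or nullptr if the word is absent.
const std::string* Lookup(const TstNode* node, const std::string& word)
{
    std::size_t i = 0;
    while (node && i < word.size())
    {
        if (word[i] < node->split)       node = node->lo.get();
        else if (word[i] > node->split)  node = node->hi.get();
        else if (i + 1 < word.size())  { node = node->eq.get(); ++i; }
        else return node->isWord ? &node->meaning : nullptr;
    }
    return nullptr;
}
```

Storing the end-of-word information in the node of the last character, as the article does, means that a prefix such as “ba” is not reported as a word merely because “bat” was inserted.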
The CTernarySearchTree class has other member functions such as NearSearch, PartialMatch, Search, Traverse and Traverse_And_Match.
Of these, the function NearSearch does a neighbour search for a particular string within a given Hamming distance, i.e. the number of character positions in which two strings of equal length differ. We can try it by typing the word “bat” and clicking the “More Words” button. The application searches within distance 2, as the following line of code shows:
void CEnglishDictionaryDlg::OnButton3()
{
    ……
    test->NearSearch(root, str.GetBuffer(str.GetLength()), 2);
    ……
}
This same function can be used for spell checking.
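The body of NearSearch is not shown in this article, so the following is only a sketch of how such a search can work, under stated assumptions. TstNode, Insert and NearWords are hypothetical names, the tree marks word ends with an isWord flag, and only stored words of the same length as the query are considered, since Hamming distance is defined for equal lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type for illustration (the article's class is CTSTNode).
struct TstNode
{
    char split;
    bool isWord;
    std::unique_ptr<TstNode> lo, eq, hi;
    explicit TstNode(char c) : split(c), isWord(false) {}
};

void Insert(std::unique_ptr<TstNode>& node, const std::string& word,
            std::size_t i = 0)
{
    assert(i < word.size());  // assumption: words are non-empty
    if (!node) node.reset(new TstNode(word[i]));
    if (word[i] < node->split)       Insert(node->lo, word, i);
    else if (word[i] > node->split)  Insert(node->hi, word, i);
    else if (i + 1 < word.size())    Insert(node->eq, word, i + 1);
    else                             node->isWord = true;
}

// Appends to 'out' every stored word with the same length as 'word' that
// differs from it in at most 'budget' positions. 'path' holds the
// characters already chosen for positions 0..i-1.
void NearSearch(const TstNode* node, const std::string& word, std::size_t i,
                int budget, std::string& path, std::vector<std::string>& out)
{
    if (!node || budget < 0 || i >= word.size()) return;

    // With no mismatches left, only the side that can still match is visited.
    if (budget > 0 || word[i] < node->split)
        NearSearch(node->lo.get(), word, i, budget, path, out);

    // Taking this node's character costs one mismatch unless it matches.
    const int left = budget - (word[i] == node->split ? 0 : 1);
    if (left >= 0)
    {
        path.push_back(node->split);
        if (i + 1 == word.size())
        {
            if (node->isWord) out.push_back(path);
        }
        else
        {
            NearSearch(node->eq.get(), word, i + 1, left, path, out);
        }
        path.pop_back();
    }

    if (budget > 0 || word[i] > node->split)
        NearSearch(node->hi.get(), word, i, budget, path, out);
}

// Convenience wrapper: all stored words within 'distance' of 'word'.
std::vector<std::string> NearWords(const TstNode* root,
                                   const std::string& word, int distance)
{
    std::vector<std::string> out;
    std::string path;
    NearSearch(root, word, 0, distance, path, out);
    return out;
}
```

For a tree holding “bat”, “mad”, “mat”, “rat”, “sad”, “sat”, “bath” and “dog”, NearWords(root.get(), "bat", 2) yields the first six words. “bath” is skipped because its length differs, and “dog” because it differs in all three positions.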
The function Traverse traverses the whole tree and fills the main list box, which shows all the words.
void CTernarySearchTree::Traverse(CTSTNode* nodeptr)
{
    if (!nodeptr) return;
    Traverse(nodeptr->LOKID);
    if (nodeptr->cSplitChar)
    {
        Traverse(nodeptr->EQKID);
    }
    // A non-NULL originalstring marks a node where a word ends.
    if (nodeptr->originalstring)
    {
        strList.AddHead(CString(nodeptr->originalstring));
    }
    Traverse(nodeptr->HIKID);
}
Fig: The Traverse function.
PartialMatch is the function responsible for filling the list with the partially matched words as we type.
void CTernarySearchTree::PartialMatch(CTSTNode* nodeptr, char* String)
{
    // Locate the node where the typed prefix ends.
    CTSTNode* Found_At = Search(nodeptr, String);
    if (!Found_At) return;
    // Collect the words stored below that node.
    Traverse_And_Match(Found_At, String);
}
Fig: The PartialMatch function.
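Search and Traverse_And_Match are not listed in this article, so the sketch below only illustrates the same two-step idea under stated assumptions: find the node where the prefix ends, then collect every word below it. TstNode, Insert, Collect, FindPrefix and PrefixMatches are hypothetical names, and an empty prefix is assumed to match every word.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type for illustration (the article's class is CTSTNode).
struct TstNode
{
    char split;
    bool isWord;
    std::unique_ptr<TstNode> lo, eq, hi;
    explicit TstNode(char c) : split(c), isWord(false) {}
};

void Insert(std::unique_ptr<TstNode>& node, const std::string& word,
            std::size_t i = 0)
{
    assert(i < word.size());  // assumption: words are non-empty
    if (!node) node.reset(new TstNode(word[i]));
    if (word[i] < node->split)       Insert(node->lo, word, i);
    else if (word[i] > node->split)  Insert(node->hi, word, i);
    else if (i + 1 < word.size())    Insert(node->eq, word, i + 1);
    else                             node->isWord = true;
}

// In-order walk: appends every word below 'node', in sorted order.
// 'path' holds the characters that lead to 'node'.
void Collect(const TstNode* node, std::string& path,
             std::vector<std::string>& out)
{
    if (!node) return;
    Collect(node->lo.get(), path, out);
    path.push_back(node->split);
    if (node->isWord) out.push_back(path);
    Collect(node->eq.get(), path, out);
    path.pop_back();
    Collect(node->hi.get(), path, out);
}

// Returns the node holding the last character of 'prefix', or nullptr.
const TstNode* FindPrefix(const TstNode* node, const std::string& prefix)
{
    std::size_t i = 0;
    while (node && i < prefix.size())
    {
        if (prefix[i] < node->split)       node = node->lo.get();
        else if (prefix[i] > node->split)  node = node->hi.get();
        else if (i + 1 < prefix.size())  { node = node->eq.get(); ++i; }
        else return node;
    }
    return nullptr;
}

// All stored words that start with 'prefix'.
std::vector<std::string> PrefixMatches(const TstNode* root,
                                       const std::string& prefix)
{
    std::vector<std::string> out;
    std::string path = prefix;
    if (prefix.empty())
    {
        Collect(root, path, out);  // assumption: empty prefix matches everything
        return out;
    }
    const TstNode* end = FindPrefix(root, prefix);
    if (!end) return out;
    if (end->isWord) out.push_back(prefix);  // the prefix itself is a word
    Collect(end->eq.get(), path, out);       // everything that extends it
    return out;
}
```

Only the middle subtree of the prefix's last node is walked, so the cost of filtering depends on the number of matches below the prefix rather than on the size of the whole dictionary.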
Conclusion
An application of this kind can serve as the basis for a dictionary application on mobile phones. The NearSearch algorithm can be used for spell checking, and the PartialMatch functionality can be used for developing a phone book on a mobile device.
Reference
- The article “Ternary Search Trees” by Jon Bentley and Bob Sedgewick that appeared in Dr. Dobb’s Journal.
History
- 8th April, 2006: Initial post