Introduction
This project is a small program that searches the contents of text files. Suppose you have a set of text files stored somewhere on your hard disk and you want to find some of them, but you don't remember their names. You do remember part of their content, so you have some keywords to search for. This is similar to the search function in Windows.
Background
Some of the requirements are:
- Create the FileList: Create a text file named FileList to store the paths of all the text files. Each line of this file is one file path, and the line number serves as the ID that identifies the path. ID numbers start at 0.
- Indexing: Scan all text files and store each word in a Binary Search Tree so that it can be found quickly. Every node in the tree contains a word, a list of ID numbers, and left and right pointers.
- Display: Output only a small portion of each text file that contains the keywords, together with the ID that shows which file was found.
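The node described in the Indexing requirement can be sketched in standard C++ as follows. The field names (word, IDs, left, right, ID, next) follow the code later in this article, but the field types and the helpers AddID and InsertWord are assumptions made for illustration, not the program's actual code. Nodes are never freed here, to keep the sketch short.

```cpp
#include <string>

// Field names follow the article's code; the types are assumptions.
struct ListID {
    int ID;        // line number of the file path in FileList (0-based)
    ListID* next;  // next file ID for the same word, or nullptr
};

struct tree {
    std::string word;  // the indexed word (BST key)
    ListID* IDs;       // files that contain the word
    tree* left;        // words that sort before this one
    tree* right;       // words that sort after this one
};

// Hypothetical helper: append id to the list unless it is already there.
void AddID(ListID*& list, int id) {
    ListID** link = &list;
    while (*link) {
        if ((*link)->ID == id) return;  // already recorded for this word
        link = &(*link)->next;
    }
    *link = new ListID{id, nullptr};
}

// Hypothetical helper: record that `word` occurs in the file with this id.
void InsertWord(tree*& root, const std::string& word, int id) {
    tree** link = &root;
    while (*link) {
        int cmp = word.compare((*link)->word);
        if (cmp == 0) {                 // word already indexed: add the ID
            AddID((*link)->IDs, id);
            return;
        }
        link = (cmp < 0) ? &(*link)->left : &(*link)->right;
    }
    *link = new tree{word, nullptr, nullptr, nullptr};
    AddID((*link)->IDs, id);
}
```

Calling InsertWord(root, "map", 0) and then InsertWord(root, "map", 2) leaves a single node for "map" whose ID list holds 0 and 2, which is the shape the search code expects.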
Using the Code
To create the FileList, I use the CStdioFile class:
CStdioFile file;
// Create (or overwrite) the list; skip the scan if the file cannot be opened.
if (file.Open("FileList.txt", CFile::modeCreate | CFile::modeReadWrite))
{
    CFileFind Finder;
    BOOL bWorking = Finder.FindFile(m_PATH + "\\*.txt");
    while (bWorking)
    {
        bWorking = Finder.FindNextFile();
        if (!Finder.IsDirectory())
        {
            // One path per line; the line number is the file's ID.
            file.WriteString(Finder.GetFilePath());
            file.WriteString("\n");
        }
    }
    file.Close();
}
For searching, I use a Binary Search Tree (BST) to store the words. First, I scan the directory that holds the text files to create the FileList. Then, I open every text file listed in the FileList and scan it for words. Every word is stored in the BST. A word can occur in many files and so can have many IDs, so each node keeps a linear linked list of ID numbers.
ListID* CTinyGoogleDlg::SearchWord(string key)
{
    // Start at the root; current stays NULL if the tree is empty.
    tree* current = head;
    if (!head)
        MessageBox("Something's wrong!");
    while (current)
    {
        // Compare once per node; c_str() is used because strcmp takes
        // C strings (this assumes key is a std::string).
        int cmp = strcmp(current->word, key.c_str());
        if (cmp == 0)
            break;
        current = (cmp < 0) ? current->right : current->left;
    }
    // NULL means the word is not in the index.
    return current ? current->IDs : NULL;
}
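The scanning step that fills the tree is not listed in this article. As a rough sketch of how a line of text might be split into words before they are inserted, the function below treats every non-alphanumeric character as a separator. The name ExtractWords and the separator rule are assumptions for illustration; the actual program may split words differently.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split a line into words, treating every
// non-alphanumeric character as a separator. Case is left unchanged.
std::vector<std::string> ExtractWords(const std::string& line) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : line) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(ch);  // still inside a word
        } else if (!current.empty()) {
            words.push_back(current);          // separator ends the word
            current.clear();
        }
    }
    if (!current.empty())
        words.push_back(current);              // word that ends the line
    return words;
}
```

Each word returned for a line of file number N would then be stored in the tree together with the ID N.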
Next, the program asks the user for the keywords and searches the tree to find out whether they exist. If a keyword is found, its IDs are used to open the matching text files, and the first 16 lines of each file are printed in the result box.
int CTinyGoogleDlg::Display(ListID *curr)
{
    CStdioFile file;
    CString sText;
    m_RESULT = "";
    if (!curr)
    {
        MessageBox("NOT FOUND!");
        return 0;
    }
    if (file.Open("FileList.TXT", CFile::modeRead))
    {
        int count = -1;
        while (curr)
        {
            // Read forward to line number curr->ID; IDs must be in ascending order.
            CString path;
            do
            {
                file.ReadString(path);
                count += 1;
            } while (count < curr->ID);

            CString DocID;
            DocID.Format("%d", curr->ID);
            m_RESULT = m_RESULT + "\r\nDocID:" + DocID + "\r\n";

            // Show at most the first 16 lines; stop early on a short file.
            CStdioFile read;
            if (read.Open(path, CFile::modeRead))
            {
                for (short nLineCount = 0; nLineCount < 16 && read.ReadString(sText); nLineCount++)
                    m_RESULT = m_RESULT + sText + "\r\n";
                read.Close();
            }
            curr = curr->next;
        }
        // Update the edit control once, after all results are collected.
        GetDlgItem(IDC_EDIT_RESULT)->SetWindowText(m_RESULT);
        file.Close();
    }
    return 0;
}
Points of Interest
In the beginning, I had some trouble working out how to find the file paths. It is not a hard problem, but at my level it was not easy either. I found some approaches on the Internet, and CodeProject helped me a lot. Now I am sharing my little program with others.
History
The first version of this program was written as a Win32 console app. This version is an MFC app.