Introduction
A permuted index, also called a keyword-in-context index, is an alphabetical list of keywords displayed with their surrounding text (context). The screenshot above shows an example. The display format makes it easy to scan for keywords of interest, the context helps to identify the particular instance when a keyword appears multiple times, and the keywords themselves can serve as links for direct access to the text or objects to which they refer.
Background
The best-known keyword index is the alphabetical index that appears in the back of a book, which lists keywords along with the numbers of the pages on which they appear. A reader using such an index must examine each listed page until the desired reference turns up. The keyword-in-context index makes the desired reference easier to find by including the surrounding text along with each keyword. For readability, it is formatted so that the keywords align in a vertical column. This format may be familiar from its use in printed documentation for the Unix operating system.
This indexing approach is especially applicable to a group of entities that have one-line descriptions: the description line itself serves as the context. Descriptions of functions in a library, descriptions of photos in a collection, short quotations, or proverbs are all suitable examples.
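To make the idea concrete, here is a minimal, standalone sketch of how such an index could be built from a set of one-line descriptions. It is not the CPermutedIndex code from this article: the names KwicEntry, LowerKeyword, and BuildKwic are invented for illustration, and the sketch assumes that a keyword is any run of letters or digits and that sorting ignores case.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical entry type for this sketch (not the article's IndexEntry).
struct KwicEntry {
    std::size_t line;    // index of the source line
    std::size_t offset;  // where the keyword starts within that line
    std::size_t length;  // keyword length in characters
};

// Lower-case copy of an entry's keyword, used only as the sort key.
static std::string LowerKeyword(const std::string& text, const KwicEntry& e)
{
    assert(e.offset + e.length <= text.size());
    std::string key = text.substr(e.offset, e.length);
    for (char& c : key)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return key;
}

// Creates one entry per word per line, sorted alphabetically by keyword.
std::vector<KwicEntry> BuildKwic(const std::vector<std::string>& lines)
{
    std::vector<KwicEntry> entries;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::string& s = lines[i];
        std::size_t pos = 0;
        while (pos < s.size()) {
            // Skip separators, then consume one run of letters or digits.
            while (pos < s.size() &&
                   !std::isalnum(static_cast<unsigned char>(s[pos])))
                ++pos;
            const std::size_t start = pos;
            while (pos < s.size() &&
                   std::isalnum(static_cast<unsigned char>(s[pos])))
                ++pos;
            if (pos > start)
                entries.push_back({i, start, pos - start});
        }
    }
    // A stable sort keeps equal keywords in source-line order.
    std::stable_sort(entries.begin(), entries.end(),
        [&](const KwicEntry& a, const KwicEntry& b) {
            return LowerKeyword(lines[a.line], a) <
                   LowerKeyword(lines[b.line], b);
        });
    return entries;
}
```

Each entry records only a line number and a span within that line, so the original strings are never copied. Recomputing the sort key on every comparison keeps the sketch short; a real indexer would probably cache it.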
Using the code
The code contains the indexer itself, CPermutedIndex, and a dialog-based, AppWizard-generated sample application that accepts user-entered strings, builds an index from those strings, and displays the index in a list box and an alphabetical "thumb-index" window. The sample application can also write the index to a file as HTML, or as C source code for use in another project.
Building the index
To build the index, construct a CPermutedIndex object, and call its BuildIndex method with an array of the strings to be indexed.
m_pPermutedIndex = new CPermutedIndex;
m_pPermutedIndex->BuildIndex(m_pInputLinePtrs, m_nInputLineCount);
Note that the index object does not copy the strings, so the array of strings must persist as long as the index is needed.
Excluding words from the index
Some words add only clutter to the index: articles and connectives such as the, and, of, as well as context-specific filler words such as avenue, street, gram, kilometer, volt, second. To exclude your own set of words, call SetExcludedWordList with an array of strings before building the index.
// An array of pointers to the excluded words. If SetExcludedWordList
// accepts const char * pointers, prefer declaring the array that way.
static char *szExcludedWords[] =
{
    "the", "and", "of", "avenue", "street", "gram",
    "kilometer", "volt", "second"
};
...
m_pPermutedIndex = new CPermutedIndex;
m_pPermutedIndex->SetExcludedWordList(szExcludedWords,
    sizeof(szExcludedWords)/sizeof(szExcludedWords[0]));
m_pPermutedIndex->BuildIndex(m_pInputLinePtrs, m_nInputLineCount);
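The article does not show how the exclusion test itself is carried out. One plausible approach is a simple scan of the list, sketched below on the assumption that matching ignores case and compares whole words only. IsExcludedWord is a hypothetical helper, not a member of CPermutedIndex.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if the word starting at pszWord
// (nLen characters, not necessarily NUL-terminated) matches an entry
// in the excluded list, ignoring case.
bool IsExcludedWord(const char *pszWord, std::size_t nLen,
                    const char *const *ppszExcluded, std::size_t nCount)
{
    assert(pszWord != nullptr);
    assert(ppszExcluded != nullptr || nCount == 0);
    for (std::size_t i = 0; i < nCount; ++i) {
        const char *pszEx = ppszExcluded[i];
        if (std::strlen(pszEx) != nLen)
            continue;                       // lengths differ: no match
        std::size_t k = 0;
        while (k < nLen &&
               std::tolower(static_cast<unsigned char>(pszWord[k])) ==
               std::tolower(static_cast<unsigned char>(pszEx[k])))
            ++k;
        if (k == nLen)
            return true;                    // every character matched
    }
    return false;
}
```

A linear scan is adequate for a short list like the one above. For a long list, sorting it once and using a binary search would be a reasonable alternative.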
Using the index
To use the index with a Windows list box, create the list box with the LBS_OWNERDRAWFIXED style. Derive a class from CListBox, and override its DrawItem method to call DrawListBoxItem in your CPermutedIndex object. Then, call FillIndexListBox in your CPermutedIndex object after building the index.
class CIndexListBox : public CListBox
{
public:
    CPermutedIndexDemoDlg *m_pParentDlg;
protected:
    // DrawItem is a virtual function of CListBox, not a message handler.
    virtual void DrawItem(LPDRAWITEMSTRUCT lpDrawItemStruct);
};
void CIndexListBox::DrawItem(LPDRAWITEMSTRUCT pDiS)
{
    // Delegate all drawing of the row to the index object.
    m_pParentDlg->m_pPermutedIndex->DrawListBoxItem(pDiS);
}
To respond to the user selecting an entry in the list box, add a handler for the LBN_SELCHANGE notification (an ON_LBN_SELCHANGE entry in the message map), and call GetIndexTableEntry in your CPermutedIndex object.
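void CPermutedIndexDemoDlg::OnSelchangeIndexList()
{
    int nCurSel = m_wndIndexList.GetCurSel();
    if (nCurSel == LB_ERR)
        return;    // nothing is selected
    // The item data of each row is used as the index-table entry number.
    int nEntry = (int)m_wndIndexList.GetItemData(nCurSel);
    const IndexEntry *pEntry = m_pPermutedIndex->GetIndexTableEntry(nEntry);
    int nSourceLine = pEntry->nItemIndex;    // position in the input array
}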
At this point, you need to know what an IndexEntry struct looks like:
typedef struct _IndexEntry
{
    int nItemIndex;
    const char *pszText;
    short nKeywordOffset;
    short nKeywordLength;
} IndexEntry;
All the information needed to display the index and to relate an entry to its source string can be extracted from this structure.
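As an illustration, the sketch below splits an entry into the text before the keyword, the keyword itself, and the text after it. It assumes that pszText points to the NUL-terminated source line and that nKeywordOffset and nKeywordLength are character counts measured from the start of that line; the article does not state this explicitly. EntryParts and SplitEntry are hypothetical names, and the struct is repeated only so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Repeated from the article so that the sketch is self-contained.
typedef struct _IndexEntry
{
    int nItemIndex;
    const char *pszText;
    short nKeywordOffset;
    short nKeywordLength;
} IndexEntry;

// Hypothetical result type: the three pieces of one index line.
struct EntryParts {
    std::string left;     // context before the keyword
    std::string keyword;  // the keyword itself
    std::string right;    // context after the keyword
};

// Assumes pszText is the whole source line and the offset and length
// are measured in characters from its start.
EntryParts SplitEntry(const IndexEntry& e)
{
    assert(e.pszText != nullptr);
    assert(e.nKeywordOffset >= 0 && e.nKeywordLength >= 0);
    const std::size_t len = std::strlen(e.pszText);
    const std::size_t off = static_cast<std::size_t>(e.nKeywordOffset);
    const std::size_t klen = static_cast<std::size_t>(e.nKeywordLength);
    assert(off + klen <= len);

    const std::string text(e.pszText, len);
    EntryParts p;
    p.left = text.substr(0, off);
    p.keyword = text.substr(off, klen);
    p.right = text.substr(off + klen);
    return p;
}
```

With the line "Open a file" and a keyword at offset 5 of length 1, the three parts are "Open ", "a", and " file".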
Thumb-index tab window
The "thumb-index tab" window displays the letters of the alphabet. Clicking on a letter returns the number of the first index entry that begins with that letter; the application can then scroll the list box to that entry by setting the list box selection to that entry.
To implement the thumb-index tab window, place a white rectangle static control on your dialog, and give it the SS_NOTIFY style. Derive a class from CStatic, and attach an instance of this class to the control. Implement the derived class as follows:
class CIndexTabWin : public CStatic
{
public:
    CPermutedIndexDemoDlg *m_pParentDlg;
protected:
    afx_msg void OnPaint();
    afx_msg void OnLButtonDown(UINT nType, CPoint point);
    DECLARE_MESSAGE_MAP()
};
BEGIN_MESSAGE_MAP(CIndexTabWin, CStatic)
    ON_WM_PAINT()
    ON_WM_LBUTTONDOWN()
END_MESSAGE_MAP()
void CIndexTabWin::OnPaint()
{
    PAINTSTRUCT ps;
    CDC *pDC = BeginPaint(&ps);
    m_pParentDlg->m_pPermutedIndex->DrawIndexTabWindow(this, pDC);
    EndPaint(&ps);
}
void CIndexTabWin::OnLButtonDown(UINT nType, CPoint point)
{
    m_pParentDlg->ProcessTabWindowClick(point.x);
}
To link the tab window to the list box, you'll need code like this:
void CPermutedIndexDemoDlg::ProcessTabWindowClick(int nXPos)
{
    int nEntry = m_pPermutedIndex->IndexOffsetFromTabXCoord(nXPos);
    if (nEntry >= 0)
        m_wndIndexList.SetCurSel(nEntry);
}
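The internals of IndexOffsetFromTabXCoord are not shown in this article. The following standalone sketch shows one way such a lookup could work, assuming the tab strip is divided into 26 equal-width cells and that a negative result means "no entry for this letter". BuildLetterTable and EntryFromTabX are hypothetical names, and the explicit width parameter is an assumption made for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: for each letter A..Z, the number of the first entry whose
// keyword starts with that letter, or -1 if there is none.
// The keywords are assumed to be sorted already.
std::array<int, 26> BuildLetterTable(const std::vector<std::string>& sortedKeywords)
{
    std::array<int, 26> first;
    first.fill(-1);
    for (std::size_t i = 0; i < sortedKeywords.size(); ++i) {
        if (sortedKeywords[i].empty())
            continue;
        const int c = std::toupper(
            static_cast<unsigned char>(sortedKeywords[i][0]));
        if (c >= 'A' && c <= 'Z' && first[c - 'A'] < 0)
            first[c - 'A'] = static_cast<int>(i);
    }
    return first;
}

// Hypothetical: maps an x coordinate inside a strip nWidth pixels wide,
// split into 26 equal cells, to the first entry for the letter clicked.
int EntryFromTabX(const std::array<int, 26>& first, int nXPos, int nWidth)
{
    assert(nWidth > 0);
    if (nXPos < 0 || nXPos >= nWidth)
        return -1;                          // click outside the strip
    const int nLetter = nXPos * 26 / nWidth;
    return first[nLetter];
}
```

Returning -1 for letters without entries matches the nEntry >= 0 test in ProcessTabWindowClick above, so a click on an inactive letter leaves the list box selection unchanged.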
Exporting the index for other uses
The information needed to draw the index and the tab window and to link them appropriately is all available from the CPermutedIndex object. The demo application contains code to export the index as linked HTML or as C-language arrays and structs. Examining this application should provide insight into how to use the index.
Points of interest
This indexer was originally written as part of a viewer for a collection of historical photographs, each described by a line of text. That application could also export the index as HTML with links to the image files. Experience with the index suggested that the indexing capability might be useful elsewhere.
Displaying the permuted index in a Windows list box gives an education in the use of owner-drawn list boxes. This implementation draws the text up to the keyword using TA_RIGHT text alignment, anchored at the horizontal center of the drawing rectangle, then draws the keyword and any text to its right using TA_LEFT alignment from the same center point. The keyword is drawn in a different color and, optionally, in a different font; the demo application uses a slightly bolder font to make the keyword stand out.
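The same alignment idea can be tried without any GDI calls by formatting the index as fixed-width text. The sketch below is an analogy rather than the article's drawing code: FormatKwicLine is a hypothetical function that right-aligns the left context against a chosen column so that every keyword starts at that column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical: lays out one index line so that the keyword always starts
// at character position nColumn. The left context is right-aligned against
// that column, and truncated on its far side if it is too long.
std::string FormatKwicLine(const std::string& left,
                           const std::string& keywordAndRight,
                           std::size_t nColumn)
{
    std::string padded;
    if (left.size() >= nColumn)
        padded = left.substr(left.size() - nColumn);  // keep text nearest the keyword
    else
        padded = std::string(nColumn - left.size(), ' ') + left;
    assert(padded.size() == nColumn);
    return padded + keywordAndRight;
}
```

Truncating the far end of an overlong left context plays the same role that clipping to the drawing rectangle plays in the list box.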
Displaying the index in HTML uses a similar approach. The index is created as a two-column table with invisible borders. The left column shows text up to the keyword and is right-justified; the right column shows the keyword and its following text, left-justified. The keyword is given a link to the appropriate target. The result appears as a single line of text, with the keywords aligned vertically.
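A table row of the kind described here might be generated as follows. This is a sketch, not the demo application's export code: HtmlEscape and KwicRowHtml are hypothetical helpers, the link target is passed in by the caller, and the align attribute is used only because it is the simplest way to show the two justifications.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escapes the characters that are special in HTML.
std::string HtmlEscape(const std::string& s)
{
    std::string out;
    for (char c : s) {
        switch (c) {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:  out += c;        break;
        }
    }
    return out;
}

// Hypothetical: builds one two-column row. The left cell holds the text
// before the keyword, right-justified; the right cell holds the linked
// keyword followed by the rest of the line, left-justified.
std::string KwicRowHtml(const std::string& left, const std::string& keyword,
                        const std::string& right, const std::string& href)
{
    assert(!keyword.empty());
    return "<tr><td align=\"right\">" + HtmlEscape(left) +
           "</td><td align=\"left\"><a href=\"" + HtmlEscape(href) + "\">" +
           HtmlEscape(keyword) + "</a>" + HtmlEscape(right) + "</td></tr>";
}
```

For the two cells to read as one line of text, the enclosing table would also need its border, cell spacing, and cell padding set to zero or close to it. Trailing spaces in the left cell may collapse in a browser, so some adjustment there is likely to be needed.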
The HTML thumb-index tabs are created as a series of links (for active letters) or strings (for inactive letters). Active entries link to the corresponding entry in the index, so those targets are created when the index is written, using the GetEntrySectionNumber method to get the tag, if any, associated with each index entry.
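The tab strip itself could be written out along these lines. The sketch is hypothetical throughout: ThumbTabsHtml is an invented function, the idx_ prefix for anchor names is an assumption, and the active-letter flags stand in for whatever GetEntrySectionNumber reports in the real code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical: active[i] is true when at least one keyword starts with
// the letter 'A' + i. Active letters become links to an anchor named
// idx_<letter>; inactive letters are written as plain text.
std::string ThumbTabsHtml(const std::vector<bool>& active)
{
    assert(active.size() == 26);
    std::string html;
    for (int i = 0; i < 26; ++i) {
        const char letter = static_cast<char>('A' + i);
        if (active[i]) {
            html += "<a href=\"#idx_";
            html += letter;
            html += "\">";
            html += letter;
            html += "</a> ";
        } else {
            html += letter;
            html += ' ';
        }
    }
    return html;
}
```

For the links to resolve, the first index row for each active letter would need a matching anchor, for example an element carrying the name idx_A.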
History
- September 15, 2006: created.
- September 20, 2006: fixed to compile with VC7.
- October 2, 2006: added text description of SetExcludedWordList.