Download demo project - 85 Kb
Download the current US English Dictionary - 820 Kb
Description:
This project is an evolution of the spell
checking engine project I submitted earlier. This project includes
numerous enhancements to the core spelling engine plus the addition of a
"check-as-you-type" edit control and the related support dialogs (see
above).
This project is not complete; it is a work in progress. The current
version has numerous issues which need to be addressed. My long-term
goal is to develop this to "commercial quality".
I am going to continue to improve this engine toward that goal, and I
will post updates as I feel necessary.
Changes from previous version:
- Reorganized class architecture. Added CFPSSpellCheckEngineOptions
and CFPSDictionary to core engine.
- Created CFPSSpellingEditCtrl, a CEdit derived class, to implement
the "check-as-you-type" edit control.
- Created options property pages (see above)
- Created spelling dialog (see above)
- Created common-use dictionary. (Download
available above)
- Updated US English dictionary with improved word
list + proper names.
- Incremental changes to the MetaphoneEx function.
- Addition of the EditDistance function.
- Added support for case-sensitive dictionary
entries.
- Added file header to dictionaries.
To-do list:
- Rewrite the EditDistance algorithm for performance.
- Research compression options for dictionaries
- Create ATL ActiveX control from edit control
- Create COM based dictionary support for
language independence
- Create Rich Edit
"check-as-you-type" control.
- Add auto-correct capability.
- Add sentence begin recognition for automatic
upper case decisions.
- Continue to improve US English dictionary.
- Implement a binary-search mechanism on
dictionary look-ups.
- Continue to improve the MetaphoneEx function.
- Create C# (.NET) version (when C# stabilizes)
- ETC...
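The binary-search look-up item in the list above could be sketched as follows. This is only an illustration, assuming the words are held in memory in a sorted array; `SortedWordList` and its members are hypothetical names and are not part of the current CFPSDictionary class or its file format.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory word list (not the real CFPSDictionary).
// Words are sorted once so that each look-up is O(log n).
class SortedWordList {
public:
    explicit SortedWordList(std::vector<std::string> words)
        : m_words(std::move(words)) {
        // Sort and drop duplicates so binary search is valid.
        std::sort(m_words.begin(), m_words.end());
        m_words.erase(std::unique(m_words.begin(), m_words.end()),
                      m_words.end());
    }

    // Case-sensitive exact match using binary search.
    bool Contains(const std::string& word) const {
        return std::binary_search(m_words.begin(), m_words.end(), word);
    }

    std::size_t Size() const { return m_words.size(); }

private:
    std::vector<std::string> m_words;
};
```

Sorting once at load time keeps the per-word cost low, which matters when every displayed word is checked repeatedly.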
Classes:
CFPSSpellCheckEngine |
This is the core spelling engine. It is intended to
be language independent (it currently is not). This engine encapsulates
the functionality of managing dictionaries, making suggestions (through
dictionaries), and maintaining spelling options. |
CFPSSpellCheckEngineOptions |
Support class for CFPSSpellCheckEngine which implements
support for storing, saving and loading spell checking options.
Currently, this uses a serialized file to store options, but could
easily be changed to INI file or registry. |
CFPSDictionary |
Base dictionary class. Defines a set of virtual
functions generic to all dictionaries. It also provides a base
implementation of all virtual functions based on current
requirements. This class uses a defined file structure with a file
header and any number of dictionary records. Future derivations of
this class will provide language-specific support. |
CDlgSpellChecker |
CDialog derived class which implements the spell checker
dialog. Currently the undo support is based on the edit control's undo
support, not on a spell checker undo. This needs further improvement. |
CPrShtSpellOptions |
CPropertySheet derived class which implements the spell
checking engine's property sheet. |
CPrPgeSpellOptions_General |
CPropertyPage derived class implementing general options
panel. |
CPrPgeSpellOptions_User |
CPropertyPage derived class implementing user dictionary
options panel. |
CPrPgeSpellOptions_Common |
CPropertyPage derived class implementing common misspellings
options panel. |
Support functions of importance:
void CheckSpellingEdit (CFPSSpellCheckEngine* pEngine, CEdit* pEdit)
|
This function is called when a
user presses the F7 (or configured) hot key from within the
"check-as-you-type" edit control. It displays the
CDlgSpellChecker dialog box. |
void CheckSpellingRich (CFPSSpellCheckEngine* pEngine, CRichEditCtrl* pEdit)
|
This function is called when a
user presses the F7 (or configured) hot key from within the
"check-as-you-type" edit control. It displays the
CDlgSpellChecker dialog box.
NOTE: This function is not currently being used because the rich edit
control is not complete. |
int EditDistance(const char *szWord1, const char *szWord2) |
This function takes two words and returns an approximation of the
minimum number of changes a user would need to make for the two words
to match. This function is not a true edit-distance algorithm, but a
customized algorithm for this spell checking application. |
void MetaphoneEx(const char *szInput, char *szOutput, int iMaxLen) |
This function is passed a word
and it returns (through the szOutput parameter) a modified-metaphone
representation of the word. This is a variation on the algorithm
originally written by Lawrence Philips
(http://www.cuj.com/archive/1806/feature.html). A newer version of his
algorithm (double-metaphone) is also available. I have tested
double-metaphone with the spell checking engine and was not impressed
with the results. It does provide fast results and a high hit-rate, but
it also returns far too many results (on average). However, I am
considering using it in conjunction with the EditDistance
algorithm and will review this further. |
void SortMatches(LPCSTR lpszBadWord, CStringList &Matches) |
This function sorts a list of
word suggestions based on the approximate edit-distance between the
words in the list and the misspelled word passed in as lpszBadWord. |
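To illustrate the idea behind EditDistance and SortMatches, here is a minimal sketch using the classic Levenshtein distance and standard C++ containers. It is not the code from this project: EditDistance is a customized approximation rather than a true edit-distance algorithm, and SortMatches works on a CStringList. The names `LevenshteinDistance` and `SortByDistance` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Classic Levenshtein distance: insert, delete and substitute cost 1 each.
// Stand-in for EditDistance, which uses a customized approximation.
int LevenshteinDistance(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = static_cast<int>(j);            // "" -> b[0..j)
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = static_cast<int>(i);            // a[0..i) -> ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // delete from a
                                curr[j - 1] + 1,      // insert into a
                                prev[j - 1] + cost}); // substitute
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Orders suggestions so the closest match to badWord comes first.
// Ties keep their original relative order (stable sort).
void SortByDistance(const std::string& badWord,
                    std::vector<std::string>& matches) {
    typedef std::pair<int, std::string> Keyed;
    std::vector<Keyed> keyed;
    keyed.reserve(matches.size());
    for (std::size_t i = 0; i < matches.size(); ++i)
        keyed.push_back(Keyed(LevenshteinDistance(badWord, matches[i]),
                              matches[i]));
    std::stable_sort(keyed.begin(), keyed.end(),
                     [](const Keyed& x, const Keyed& y) {
                         return x.first < y.first;
                     });
    for (std::size_t i = 0; i < matches.size(); ++i)
        matches[i] = keyed[i].second;
}
```

Computing each distance once before sorting avoids recomputing it inside the comparison function, which helps when the suggestion list is long.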
Architecture:
CORE ENGINE
The core spell checking engine consists of three classes:
CFPSSpellCheckEngine, CFPSSpellCheckEngineOptions and CFPSDictionary.
These classes provide support for dictionary-related functions such as
adding a word, removing a word, ignoring a word, loading a dictionary,
saving a dictionary, checking whether a word is in the dictionary,
suggesting possible matches, etc.
The core engine is implemented as a strict
back-end engine. It has no user-interface components. Most of the
functions exposed by these classes where an error might occur return an int
return code. These return codes are defined in 1) FPSSpellCheckerInclude.h
and 2) the header file for a given class. The return codes should always
be examined to determine the completion status of these functions.
Special care has been taken to ensure that
these classes are very stable and robust. Also, performance
considerations weigh heavily on the implementation of these classes. Very
little MFC code is used in these classes and functions.
CHECK-AS-YOU-TYPE EDIT CONTROL
The check-as-you-type edit control is contained in the
CFPSSpellingEditCtrl class. It is derived from CEdit and works by
subclassing an existing edit control through the AttachEdit function.
To improve performance, this control implements a timer and, whenever
there is no user activity (typing, mouse clicking, scrolling, etc.),
checks the spelling of the displayed portion of the edit control. The
function RedrawSpellingErrors is called to perform the checking. It
checks only the displayed portion of the edit control and calls
DrawSpellingError for each displayed word. If a word is not found in
the dictionary, DrawSpellingError calls DrawSquigly to draw the
squiggly underline for the word. DrawSquigly creates a structure of
type FPSSPELLEDIT_ERRORS and adds it to the m_SpellingErrors member
list.
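The "check only when idle" idea can be sketched independently of MFC as follows. This is an assumption-laden illustration, not the control's actual code: the class name `IdleSpellCheckScheduler`, its members and the delay value are hypothetical, and the real control uses a window timer rather than an explicit clock argument.

```cpp
#include <chrono>

// Hypothetical helper: decides whether enough idle time has passed
// since the last user activity to justify a spell check pass.
class IdleSpellCheckScheduler {
public:
    typedef std::chrono::steady_clock Clock;

    explicit IdleSpellCheckScheduler(std::chrono::milliseconds idleDelay)
        : m_idleDelay(idleDelay), m_lastActivity(), m_pending(false) {}

    // Call on every key press, mouse click or scroll.
    void OnUserActivity(Clock::time_point now) {
        m_lastActivity = now;
        m_pending = true;   // visible text may have changed
    }

    // Call from the timer tick. Returns true at most once per burst
    // of activity, after the user has been idle for m_idleDelay.
    bool ShouldCheckNow(Clock::time_point now) {
        if (!m_pending || now - m_lastActivity < m_idleDelay)
            return false;
        m_pending = false;
        return true;
    }

private:
    std::chrono::milliseconds m_idleDelay;
    Clock::time_point m_lastActivity;
    bool m_pending;
};
```

In a control like this one, a true result from `ShouldCheckNow` would be the point at which a function such as RedrawSpellingErrors is called; passing the time in as a parameter simply makes the logic easy to test.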
The OnRButtonDown function checks the m_SpellingErrors list to
determine when to display the normal popup menu and when to display the
spell check popup menu. Suggestions returned from the core engine are
sorted using the SortMatches function to display them in order of
edit-distance.
The PreTranslateMessage function checks for a hot key (defaults to
F7). This can be customized by calling the SetHotKey static member
function. When the hot key is pressed, the CheckSpellingEdit function
is called to display the spell checking dialog box.
SPELL CHECK DIALOG BOX
The spell checking dialog box is implemented in the CDlgSpellChecker
class. This is a standard CDialog derived class based on the
IDD_SPELL_CHECK dialog resource.
The spell checking dialog is modelled after the
Microsoft Word implementation of spell checking. It is laid out the same
and functions (for the most part) the same. This dialog searches an edit
control (or rich edit control) for misspelled words and displays the
containing sentence with the misspelled word highlighted.
Suggestions returned from the core engine are sorted using the
SortMatches function to display them in order of edit-distance.
How to use the demo:
- Unzip the provided file into a directory (be
sure to extract the sub directories.)
- Make sure that the USMain.dic file is in the
\Release directory.
- Make sure that the USCommon.dic file is in the
\Release directory.
- Execute the FPSSpellChecker.exe from the
\Release directory.
How to incorporate the spell checker into an
application:
- In your application's InitInstance function, add a call to the
CFPSSpellingEditCtrl::InitSpellingEngine(NULL) static member function;
OR, instead of NULL, pass in a string containing a fully qualified
path to a spell checking engine options file.
- In your application's ExitInstance function, add a call to the
CFPSSpellingEditCtrl::Terminate static member function.
- Add the following files to your
project.
DlgSpellChecker.cpp |
DlgSpellChecker.h |
DlgSpellingEditCtrl.cpp |
DlgSpellingEditCtrl.h |
FPSDictionary.cpp |
FPSDictionary.h |
FPSSpellCheckEngine.cpp |
FPSSpellCheckEngine.h |
FPSSpellCheckEngineOptions.cpp |
FPSSpellCheckEngineOptions.h |
FPSSpellCheckerInclude.cpp |
FPSSpellCheckerInclude.h |
FPSSpellingEditCtrl.cpp |
FPSSpellingEditCtrl.h |
PrPgeSpellOptions_Common.cpp |
PrPgeSpellOptions_Common.h |
PrPgeSpellOptions_General.cpp |
PrPgeSpellOptions_General.h |
PrPgeSpellOptions_User.cpp |
PrPgeSpellOptions_User.h |
PrShtSpellOptions.cpp |
PrShtSpellOptions.h |
- Copy the following resource items to your
project.
IDD_SPELL_CHECK |
|
IDD_SPELL_OPTION_COMMON |
|
IDD_SPELL_OPTION_GENERAL |
|
IDD_SPELL_OPTION_USER |
|
- Include the "FPSSpellCheckerInclude.h"
file in your stdafx.h file.
#include "FPSSpellCheckerInclude.h" |
|
- Place a standard edit control on a form or
dialog resource and give it a unique control id (e.g. ID_TEST_EDIT).
- Add a member variable of type CFPSSpellingEditCtrl
to the dialog/form class file (e.g. m_editTest).
- In the OnInitDialog function, call the AttachEdit
member function of CFPSSpellingEditCtrl
(e.g. m_editTest.AttachEdit(this, ID_TEST_EDIT);).
Known Issues
- Performance is still not as good as it needs
to be.
- Language support is limited to US English.
- The EditDistance function needs work.
- The MetaphoneEx function needs work.
- There is a painting problem with the edit
control when scrolling the control while the spelling error "squiggly"
lines are displayed.
- Support for the rich edit control is incomplete.