Introduction
I was looking for a good spell checker and hyphenation library for .NET and found Hunspell and Hyphen, the free (LGPL-licensed) spell checking and hyphenation libraries used in OpenOffice. Neither was available for the .NET platform, so I decided to write a wrapper/port. Conveniently, many of the OpenOffice dictionaries are LGPL-licensed too and can be used in proprietary applications.
Interop code to the native Hunspell functions
I wrote the wrapper/port in Managed C++ because that let me use the original Hunspell and Hyphen source code directly, and the interop code between the managed classes and the unmanaged libraries was pleasant to write. The original source code is almost unchanged, so new versions of Hunspell or Hyphen can be adopted easily. Hunspell and Hyphen allocate unmanaged memory, so I implemented the IDisposable interface and used the dispose pattern to free that memory early.
Because Hunspell uses UTF-8 encoding, I had to provide conversion functions to and from UTF-8:
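char * NHunspell::MarshalHelper::AllocUTF8FromString(String ^value)
{
    // Encode to UTF-8; the resulting array has no terminating zero byte.
    array<Byte> ^ byteArray = Encoding::UTF8->GetBytes(value);
    // One extra byte for the terminator. Using the length directly also
    // works for an empty string, where byteArray[0] would be out of range.
    int size = byteArray->Length + 1;
    IntPtr buffer = Marshal::AllocHGlobal(size);
    Marshal::Copy(byteArray, 0, buffer, byteArray->Length);
    Marshal::WriteByte(buffer, size - 1, 0);
    // The caller owns the buffer and must release it with Marshal::FreeHGlobal.
    return (char *) buffer.ToPointer();
}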
String ^ NHunspell::MarshalHelper::AllocStringFromUTF8( char * value )
{
    // strlen counts bytes up to the terminating zero, not characters.
    int size = (int) strlen(value);
    array<Byte> ^ byteArray = gcnew array<Byte>(size);
    Marshal::Copy(IntPtr(value), byteArray, 0, size);
    // Decode the UTF-8 bytes into a managed string.
    return Encoding::UTF8->GetString(byteArray);
}
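The buffer arithmetic in these helpers can be illustrated in standard C++, without any .NET types. The following is only a sketch: AllocNulTerminatedCopy and CountCodePoints are hypothetical names that are not part of NHunspell, and a unique_ptr stands in for the HGlobal allocation. It shows why the buffer is sized in bytes plus one terminator, and why the byte count of a UTF-8 string can exceed its character count.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical helper (not part of NHunspell): copies UTF-8 bytes into a new
// buffer with a trailing zero byte, as a native C API expects for char*.
std::unique_ptr<char[]> AllocNulTerminatedCopy(const std::string& utf8)
{
    const std::size_t byteCount = utf8.size();                // bytes, not characters
    std::unique_ptr<char[]> buffer(new char[byteCount + 1]);  // +1 for the terminator
    std::memcpy(buffer.get(), utf8.data(), byteCount);
    buffer[byteCount] = '\0';
    return buffer;
}

// Hypothetical helper: counts code points by skipping UTF-8 continuation
// bytes, which always have the bit pattern 10xxxxxx.
std::size_t CountCodePoints(const char* utf8)
{
    std::size_t count = 0;
    for (const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8); *p != 0; ++p)
    {
        if ((*p & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For a word with a two-byte character such as "naïve", strlen on the copy reports one byte more than CountCodePoints reports characters, which is why the conversion functions above work with byte lengths throughout.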
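Another important task is releasing the unmanaged memory. I implemented a destructor and a finalizer for this. In C++/CLI, the destructor becomes Dispose(), and the finalizer is the fallback that runs if Dispose() is never called: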
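// Destructor: the C++/CLI compiler maps this to IDisposable::Dispose().
NHunspell::Hunspell::~Hunspell()
{
    this->!Hunspell();
}
// Finalizer: run by the garbage collector if Dispose() was never called.
NHunspell::Hunspell::!Hunspell()
{
    if( handle != 0 )
    {
        delete handle;
        handle = 0; // makes a repeated call harmless
    }
}
bool NHunspell::Hunspell::IsDisposed::get()
{
    return handle == 0;
}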
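The key property of this cleanup code is that releasing the handle twice is harmless, because the pointer is reset after the first delete. A minimal standard C++ sketch of the same idea follows. NativeEngine and EngineWrapper are hypothetical stand-ins, not NHunspell types, and an explicit Release() method plays the role that Dispose() plays in the managed wrapper.

```cpp
// Stand-in for the native object; counts live instances so that the effect
// of releasing is observable.
struct NativeEngine
{
    static int liveCount;
    NativeEngine()  { ++liveCount; }
    ~NativeEngine() { --liveCount; }
};
int NativeEngine::liveCount = 0;

// Hypothetical wrapper that owns a raw native pointer.
class EngineWrapper
{
public:
    EngineWrapper() : handle(new NativeEngine()) {}
    ~EngineWrapper() { Release(); }  // cleanup path shared with explicit release

    // Copying would lead to a double delete, so it is disabled.
    EngineWrapper(const EngineWrapper&) = delete;
    EngineWrapper& operator=(const EngineWrapper&) = delete;

    // Idempotent: the null check plus the reset make a second call a no-op.
    void Release()
    {
        if (handle != nullptr)
        {
            delete handle;
            handle = nullptr;
        }
    }

    bool IsDisposed() const { return handle == nullptr; }

private:
    NativeEngine* handle;
};
```

Calling Release() early and then letting the destructor run frees the native object exactly once, which mirrors what happens when a using block disposes the managed wrapper.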
NHunspell spell checking and hyphenation sample
This is a short demo of how to use NHunspell for spell checking, suggestions, and hyphenation:
// Requires: using System; using System.Collections.Generic; using NHunspell;
Console.WriteLine("NHunspell functions and classes demo");
Console.WriteLine("");
Console.WriteLine("Spell check with Hunspell");
using (Hunspell hunspell = new Hunspell("en_us.aff", "en_us.dic"))
{
    Console.WriteLine("Check if the word 'Recommendation' is spelled correctly");
    bool correct = hunspell.Spell("Recommendation");
    Console.WriteLine("Recommendation is spelled " +
        (correct ? "correctly" : "incorrectly"));
    Console.WriteLine("");
    Console.WriteLine("Make suggestions for the word 'Recommendatio'");
    List<string> suggestions = hunspell.Suggest("Recommendatio");
    Console.WriteLine("There are " + suggestions.Count.ToString() +
        " suggestions");
    foreach (string suggestion in suggestions)
    {
        Console.WriteLine("Suggestion is: " + suggestion);
    }
}
Console.WriteLine("");
Console.WriteLine("Hyphenation with Hyphen");
// Both classes wrap unmanaged memory, so each is created in a using block.
using (Hyphen hyphen = new Hyphen("hyph_en_us.dic"))
{
    Console.WriteLine("Get the hyphenation of the word 'Recommendation'");
    HyphenResult hyphenated = hyphen.Hyphenate("Recommendation");
    Console.WriteLine("'Recommendation' is hyphenated as: " +
        hyphenated.HyphenatedWord);
}
Console.WriteLine("");
Console.WriteLine("Press any key to continue...");
Console.ReadKey();
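Because Hunspell is native C++ code, you must include the correct assembly for your platform. On x86 (32-bit) platforms, use the NHunspell.dll from the X86 folder. On x64 (64-bit) platforms, use the NHunspell.dll from the X64 folder. What matters is the bitness of the process that loads the assembly: a 32-bit process needs the X86 build even on a 64-bit operating system.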
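The selection rule can be written down as a small function of the pointer size of the running process. This is only a sketch in standard C++: PlatformFolder and NHunspellPath are hypothetical helpers, and the directory layout (a base directory containing the X86 and X64 folders) is an assumption based on the folder names above. In a .NET application, the pointer size would typically come from IntPtr.Size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps the pointer size of the current process to the
// folder name of the matching build. 8 bytes means a 64-bit process.
std::string PlatformFolder(std::size_t pointerSizeInBytes)
{
    return pointerSizeInBytes == 8 ? "X64" : "X86";
}

// Hypothetical helper: builds the path of the assembly to load, assuming the
// X86 and X64 folders sit directly under baseDir.
std::string NHunspellPath(const std::string& baseDir, std::size_t pointerSizeInBytes)
{
    return baseDir + "/" + PlatformFolder(pointerSizeInBytes) + "/NHunspell.dll";
}
```

Passing sizeof(void*) as the pointer size selects the folder for the process the code is running in, rather than for the operating system.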