Introduction
The first part of this article (Spell Check, Hyphenation, and Thesaurus for .NET with C# and VB Samples - Part 1: Single Threading) explained how to use NHunspell in single-threaded applications. With locking mechanisms, the same technique can be used in multi-threaded applications such as web servers, ASP.NET, WCF services, and so on. For these use cases, however, NHunspell provides an optimized class called SpellEngine, designed for high-throughput spell checking services.
Background
Spell checking, hyphenation, and thesaurus functions are based on dictionaries. These dictionaries and the marshalling buffers between .NET and the native DLLs are exclusive resources, so access to a single object must be synchronized: only one thread at a time can use it. On multi-processor or multi-core computers, this serialization has a huge performance impact. From a performance point of view, it is best to have as many Hunspell, Hyphen, or MyThes objects as there are processor cores, so that each core has an object available to process requests. The SpellEngine class, in conjunction with the SpellFactory class, provides this feature out of the box. By default, it instantiates as many objects as there are processor cores. Access to these objects is controlled internally by a Semaphore.
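The idea of one worker object per core, guarded by a semaphore, can be sketched in a few lines. This is an illustration under stated assumptions and not NHunspell's source code: ObjectPool and FakeChecker are hypothetical names, and FakeChecker merely stands in for an object that must never be used by two threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical pool: hands out one exclusive worker per caller and lets at
// most `size` callers run at the same time.
public sealed class ObjectPool<T> : IDisposable where T : class
{
    private readonly ConcurrentQueue<T> items = new ConcurrentQueue<T>();
    private readonly Semaphore slots;

    public ObjectPool(int size, Func<T> factory)
    {
        if (size < 1) throw new ArgumentOutOfRangeException("size");
        for (int i = 0; i < size; i++) items.Enqueue(factory());
        slots = new Semaphore(size, size);
    }

    // Blocks until a worker is free, runs the action on it, then returns it.
    public TResult Use<TResult>(Func<T, TResult> action)
    {
        slots.WaitOne();
        T item = null;
        try
        {
            // Holding a slot guarantees that the queue contains a worker.
            if (!items.TryDequeue(out item))
                throw new InvalidOperationException("Pool invariant violated.");
            return action(item);
        }
        finally
        {
            if (item != null) items.Enqueue(item);
            slots.Release();
        }
    }

    public void Dispose()
    {
        T item;
        while (items.TryDequeue(out item))
        {
            IDisposable disposable = item as IDisposable;
            if (disposable != null) disposable.Dispose();
        }
        slots.Dispose();
    }
}

// Hypothetical stand-in for a non-thread-safe worker. It throws if two
// threads ever call Check on the same instance at the same time.
public sealed class FakeChecker
{
    private int busy; // 1 while a call is in progress

    public bool Check(string word)
    {
        if (Interlocked.Exchange(ref busy, 1) == 1)
            throw new InvalidOperationException("Concurrent use detected.");
        try { return !string.IsNullOrEmpty(word); }
        finally { Interlocked.Exchange(ref busy, 0); }
    }
}

public static class Program
{
    public static void Main()
    {
        int size = Environment.ProcessorCount; // one worker per core
        using (var pool = new ObjectPool<FakeChecker>(size, () => new FakeChecker()))
        {
            int ok = 0;
            Parallel.For(0, 1000, i =>
            {
                if (pool.Use(c => c.Check("word" + i)))
                    Interlocked.Increment(ref ok);
            });
            Console.WriteLine("Pool size: " + size + ", successful calls: " + ok);
        }
    }
}
```

With a single lock around one object, every request waits for every other request. With a pool like this, requests only wait when all workers are busy, which is why the number of workers is tied to the number of cores.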
General Multi Threading Usage of NHunspell
The following code shows the methods of the SpellEngine and SpellFactory classes. In a real server application, these objects are not created and destroyed on every request. They are created at service start and disposed at service end, and all requests are served by the same object (singleton pattern). You will therefore not find this using{} block in a server application; it is here for demonstration purposes only. How it works in a real ASP.NET application is explained later.
C# code sample:
using (SpellEngine engine = new SpellEngine())
{
    Console.WriteLine();
    Console.WriteLine();
    Console.WriteLine("Adding a Language with all dictionaries " +
        "for Hunspell, Hyphen and MyThes");
    Console.WriteLine("¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯" +
        "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯");

    LanguageConfig enConfig = new LanguageConfig();
    enConfig.LanguageCode = "en";
    enConfig.HunspellAffFile = "en_us.aff";
    enConfig.HunspellDictFile = "en_us.dic";
    enConfig.HunspellKey = "";
    enConfig.HyphenDictFile = "hyph_en_us.dic";
    enConfig.MyThesIdxFile = "th_en_us_new.idx";
    enConfig.MyThesDatFile = "th_en_us_new.dat";
    Console.WriteLine("Configuration will use " + engine.Processors.ToString() +
        " processors to serve concurrent requests");
    engine.AddLanguage(enConfig);

    Console.WriteLine();
    Console.WriteLine("Check if the word 'Recommendation' is spelled correct");
    bool correct = engine["en"].Spell("Recommendation");
    Console.WriteLine("Recommendation is spelled " +
        (correct ? "correct" : "not correct"));

    Console.WriteLine();
    Console.WriteLine("Make suggestions for the word 'Recommendatio'");
    List<string> suggestions = engine["en"].Suggest("Recommendatio");
    Console.WriteLine("There are " + suggestions.Count.ToString() +
        " suggestions");
    foreach (string suggestion in suggestions)
    {
        Console.WriteLine("Suggestion is: " + suggestion);
    }

    Console.WriteLine("");
    Console.WriteLine("Analyze the word 'decompressed'");
    List<string> morphs = engine["en"].Analyze("decompressed");
    foreach (string morph in morphs)
    {
        Console.WriteLine("Morph is: " + morph);
    }

    Console.WriteLine("");
    Console.WriteLine("Find the word stem of the word 'decompressed'");
    List<string> stems = engine["en"].Stem("decompressed");
    foreach (string stem in stems)
    {
        Console.WriteLine("Word Stem is: " + stem);
    }

    Console.WriteLine();
    Console.WriteLine("Generate the plural of 'girl' by providing sample 'boys'");
    List<string> generated =
        engine["en"].Generate("girl", "boys");
    foreach (string stem in generated)
    {
        Console.WriteLine("Generated word is: " + stem);
    }

    Console.WriteLine();
    Console.WriteLine("Get the hyphenation of the word 'Recommendation'");
    HyphenResult hyphenated = engine["en"].Hyphenate("Recommendation");
    Console.WriteLine("'Recommendation' is hyphenated as: " + hyphenated.HyphenatedWord);

    Console.WriteLine("Get the synonyms of the plural word 'cars'");
    Console.WriteLine("hunspell must be used to get the word stem 'car' via Stem().");
    Console.WriteLine("hunspell generates the plural forms of the synonyms via Generate()");
    ThesResult tr = engine["en"].LookupSynonyms("cars", true);
    if (tr.IsGenerated)
        Console.WriteLine("Generated over stem (The original word " +
            "form wasn't in the thesaurus)");
    foreach (ThesMeaning meaning in tr.Meanings)
    {
        Console.WriteLine();
        Console.WriteLine(" Meaning: " + meaning.Description);
        foreach (string synonym in meaning.Synonyms)
        {
            Console.WriteLine(" Synonym: " + synonym);
        }
    }
}
Visual Basic sample:
Using engine As New SpellEngine()
    Console.WriteLine()
    Console.WriteLine()
    Console.WriteLine("Adding a Language with all dictionaries " & _
        "for Hunspell, Hyphen and MyThes")
    Console.WriteLine("¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯" & _
        "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯")

    Dim enConfig As New LanguageConfig()
    enConfig.LanguageCode = "en"
    enConfig.HunspellAffFile = "en_us.aff"
    enConfig.HunspellDictFile = "en_us.dic"
    enConfig.HunspellKey = ""
    enConfig.HyphenDictFile = "hyph_en_us.dic"
    enConfig.MyThesIdxFile = "th_en_us_new.idx"
    enConfig.MyThesDatFile = "th_en_us_new.dat"
    Console.WriteLine("Configuration will use " & _
        engine.Processors.ToString() & _
        " processors to serve concurrent requests")
    engine.AddLanguage(enConfig)

    Console.WriteLine()
    Console.WriteLine("Check if the word 'Recommendation' is spelled correct")
    Dim correct As Boolean = engine("en").Spell("Recommendation")
    Console.WriteLine("Recommendation is spelled " & _
        (If(correct, "correct", "not correct")))

    Console.WriteLine()
    Console.WriteLine("Make suggestions for the word 'Recommendatio'")
    Dim suggestions As List(Of String) = engine("en").Suggest("Recommendatio")
    Console.WriteLine("There are " & suggestions.Count.ToString() & " suggestions")
    For Each suggestion As String In suggestions
        Console.WriteLine("Suggestion is: " & suggestion)
    Next

    Console.WriteLine("")
    Console.WriteLine("Analyze the word 'decompressed'")
    Dim morphs As List(Of String) = engine("en").Analyze("decompressed")
    For Each morph As String In morphs
        Console.WriteLine("Morph is: " & morph)
    Next

    Console.WriteLine("")
    Console.WriteLine("Find the word stem of the word 'decompressed'")
    Dim stems As List(Of String) = engine("en").Stem("decompressed")
    For Each stem As String In stems
        Console.WriteLine("Word Stem is: " & stem)
    Next

    Console.WriteLine()
    Console.WriteLine("Generate the plural of 'girl' by providing sample 'boys'")
    Dim generated As List(Of String) = engine("en").Generate("girl", "boys")
    For Each stem As String In generated
        Console.WriteLine("Generated word is: " & stem)
    Next

    Console.WriteLine()
    Console.WriteLine("Get the hyphenation of the word 'Recommendation'")
    Dim hyphenated As HyphenResult = engine("en").Hyphenate("Recommendation")
    Console.WriteLine("'Recommendation' is hyphenated as: " & hyphenated.HyphenatedWord)

    Console.WriteLine("Get the synonyms of the plural word 'cars'")
    Console.WriteLine("hunspell must be used to get the word stem 'car' via Stem().")
    Console.WriteLine("hunspell generates the plural forms of the synonyms via Generate()")
    Dim tr As ThesResult = engine("en").LookupSynonyms("cars", True)
    If tr.IsGenerated Then
        Console.WriteLine("Generated over stem (The original " & _
            "word form wasn't in the thesaurus)")
    End If
    For Each meaning As ThesMeaning In tr.Meanings
        Console.WriteLine()
        Console.WriteLine(" Meaning: " & meaning.Description)
        For Each synonym As String In meaning.Synonyms
            Console.WriteLine(" Synonym: " & synonym)
        Next
    Next
End Using
Spell Checking in ASP.NET
The following sample shows how to integrate a SpellEngine in an ASP.NET application. A real ASP.NET application that uses this library for spell checking, hyphenation, and thesaurus is Spell Check, Hyphenation, and Thesaurus Online.
A working sample can be downloaded from SourceForge; see the Resources section below for the link.
First of all, we need a globally accessible SpellEngine to serve our requests. Therefore, we add a 'Global.asax' to our website and include a static instance of SpellEngine in the application object:
public class Global : System.Web.HttpApplication
{
    static SpellEngine spellEngine;
    public static SpellEngine SpellEngine { get { return spellEngine; } }
    ...
}
After that, we initialize this object in the Application_Start event and dispose of it in the Application_End event in the Global class:
protected void Application_Start(object sender, EventArgs e)
{
    try
    {
        // The native DLLs and the dictionaries are both deployed to the Bin folder.
        string dictionaryPath = Server.MapPath("Bin") + "\\";
        Hunspell.NativeDllPath = dictionaryPath;

        spellEngine = new SpellEngine();
        LanguageConfig enConfig = new LanguageConfig();
        enConfig.LanguageCode = "en";
        enConfig.HunspellAffFile = dictionaryPath + "en_us.aff";
        enConfig.HunspellDictFile = dictionaryPath + "en_us.dic";
        enConfig.HunspellKey = "";
        enConfig.HyphenDictFile = dictionaryPath + "hyph_en_us.dic";
        enConfig.MyThesIdxFile = dictionaryPath + "th_en_us_new.idx";
        enConfig.MyThesDatFile = dictionaryPath + "th_en_us_new.dat";
        spellEngine.AddLanguage(enConfig);
    }
    catch (Exception)
    {
        // Initialization failed: release the engine and reset the field so that
        // Global.SpellEngine never returns a disposed object.
        if (spellEngine != null)
            spellEngine.Dispose();
        spellEngine = null;
    }
}

protected void Application_End(object sender, EventArgs e)
{
    if (spellEngine != null)
        spellEngine.Dispose();
    spellEngine = null;
}
After that, we can access our SpellEngine from our ASPX pages. For example:
<%@ Page Language="C#" AutoEventWireup="true"
    CodeBehind="Default.aspx.cs" Inherits="WebSampleApplication._Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        <asp:TextBox ID="QueryText" runat="server"></asp:TextBox>
        <asp:Button ID="SubmitButton" runat="server" Text="Search" />
        <br />
        <asp:Literal ID="ResultHtml" runat="server" />
    </div>
    </form>
</body>
</html>
The code-behind file (Default.aspx.cs) handles the postback:
protected void Page_Load(object sender, EventArgs e)
{
    if (Page.IsPostBack)
    {
        string queryText = QueryText.Text;
        bool correct = Global.SpellEngine["en"].Spell(queryText);
        string result = "<br />";

        if (correct)
        {
            // Correctly spelled: list the synonyms, grouped by meaning.
            result += Server.HtmlEncode(queryText) + " is correct.<br />Synonyms:<br />";
            ThesResult meanings = Global.SpellEngine["en"].LookupSynonyms(queryText, true);
            if (meanings != null)
            {
                foreach (ThesMeaning meaning in meanings.Meanings)
                {
                    result += "<b>Meaning: " +
                        Server.HtmlEncode(meaning.Description) + "</b><br />";
                    int number = 1;
                    foreach (string synonym in meaning.Synonyms)
                    {
                        result += number.ToString() + ": " +
                            Server.HtmlEncode(synonym) + "<br />";
                        ++number;
                    }
                }
            }
        }
        else
        {
            // Misspelled: list the suggestions.
            result += Server.HtmlEncode(queryText) +
                " is not correct.<br /><br />Suggestions:<br />";
            List<string> suggestions = Global.SpellEngine["en"].Suggest(queryText);
            int number = 1;
            foreach (string suggestion in suggestions)
            {
                result += number.ToString() + ": " +
                    Server.HtmlEncode(suggestion) + "<br />";
                ++number;
            }
        }
        ResultHtml.Text = result;
    }
}
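The synonym list and the suggestion list in Page_Load are built by the same loop. As a possible refinement, that loop could be moved into a small helper. The sketch below is only an illustration: ResultFormatter and NumberedList are hypothetical names, and it uses System.Net.WebUtility.HtmlEncode instead of Server.HtmlEncode so that it also runs outside a web request.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper, not part of NHunspell or of the original sample.
public static class ResultFormatter
{
    // Renders each item as "<number>: <encoded text><br />", numbering from 1.
    public static string NumberedList(IEnumerable<string> items)
    {
        var html = new StringBuilder();
        int number = 1;
        foreach (string item in items)
        {
            html.Append(number)
                .Append(": ")
                .Append(WebUtility.HtmlEncode(item)) // never emit raw user text
                .Append("<br />");
            number++;
        }
        return html.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        var suggestions = new List<string> { "Recommendation", "<b>bold</b>" };
        Console.WriteLine(ResultFormatter.NumberedList(suggestions));
    }
}
```

In Page_Load, both inner loops would then shrink to a single call such as result += ResultFormatter.NumberedList(suggestions);.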
Use in Commercial Applications and Available Dictionaries
Because NHunspell is licensed under the LGPL and MPL, it can be used in commercial applications: closed source projects are allowed to link against the NHunspell.dll assembly. NHunspell uses the Open Office dictionaries, most of which are available for free.
Resources
The Open Office '.oxt' extensions are in fact Zip files. To use them with NHunspell, unzip the dictionaries they contain.
Important: Check the dictionary license before you use it!
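Because an '.oxt' file is a Zip archive, the dictionary files can also be extracted in code. The following is a minimal sketch that assumes .NET 4.5 or later (System.IO.Compression); OxtExtractor is a hypothetical name, and the list of file extensions is an assumption derived from the file names used in the configuration above.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of NHunspell.
public static class OxtExtractor
{
    // Copies the dictionary files found in an .oxt archive into targetDir
    // (flattening any folders) and returns the number of files written.
    public static int ExtractDictionaries(string oxtPath, string targetDir)
    {
        // Assumed extensions: Hunspell (.aff/.dic), Hyphen (.dic), MyThes (.idx/.dat).
        string[] wanted = { ".aff", ".dic", ".idx", ".dat" };
        Directory.CreateDirectory(targetDir);
        int count = 0;

        using (ZipArchive archive = ZipFile.OpenRead(oxtPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // entry.Name is the file name without its folder part;
                // it is empty for directory entries.
                string extension = Path.GetExtension(entry.Name).ToLowerInvariant();
                if (Array.IndexOf(wanted, extension) < 0)
                    continue;

                entry.ExtractToFile(Path.Combine(targetDir, entry.Name), true);
                count++;
            }
        }
        return count;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("Usage: <file.oxt> <targetDir>");
            return;
        }
        int extracted = OxtExtractor.ExtractDictionaries(args[0], args[1]);
        Console.WriteLine("Extracted " + extracted + " dictionary files.");
    }
}
```

The sketch skips everything else in the archive, including any license or readme files, so those still need to be read separately.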