Spell Check, Hyphenation, and Thesaurus for .NET with C# and VB Samples - Part 2: Multi Threading

24 Jul 2014 | LGPL3 | 3 min read
NHunspell (Open Office spell checker for .NET) functions for servers and ASP.NET web projects.

Introduction

The first part of this article (Spell Check, Hyphenation, and Thesaurus for .NET with C# and VB Samples - Part 1: Single Threading) explained how to use NHunspell in single-threaded applications. With locking, the same technique also works in multi-threaded applications such as web servers, ASP.NET and WCF services, and so on. For high-throughput spell checking services, however, NHunspell provides an optimized class called SpellEngine.

Background

Spell checking, hyphenation, and thesaurus functions are based on dictionaries. These dictionaries and the marshalling buffers between .NET and the native DLLs are exclusive resources, so access to them must be synchronized: only one thread can use a given object at a time. With a single shared object, this serialization has a huge performance impact on multi-processor and multi-core computers. From a performance point of view, it is best to use as many Hunspell, Hyphen, or MyThes objects as there are processor cores, so that each core has an object available to process requests. The SpellEngine class, in conjunction with the SpellFactory class, provides this feature out of the box. By default, it instantiates as many objects as there are processor cores. Access to these objects is controlled internally by a semaphore.
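The pooling idea described above can be sketched with the .NET base class library alone. The ResourcePool<T> class below is a hypothetical helper written for illustration; it is not NHunspell's implementation and makes no claim about how SpellFactory works internally. It assumes only that each pooled object must be used by one thread at a time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper (not part of NHunspell): a fixed-size pool of
// exclusive resources whose access is limited by a semaphore.
public sealed class ResourcePool<T> : IDisposable where T : class
{
    private readonly ConcurrentBag<T> items = new ConcurrentBag<T>();
    private readonly SemaphoreSlim slots;

    public int Size { get; private set; }

    // Default: one instance per processor core.
    public ResourcePool(Func<T> factory)
        : this(factory, Environment.ProcessorCount) { }

    public ResourcePool(Func<T> factory, int size)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        if (size < 1) throw new ArgumentOutOfRangeException("size");

        Size = size;
        for (int i = 0; i < size; i++)
            items.Add(factory());
        slots = new SemaphoreSlim(size, size);
    }

    // Borrows one instance, runs the work on it, and returns it to the pool.
    public TResult Use<TResult>(Func<T, TResult> work)
    {
        slots.Wait(); // blocks while every instance is in use
        T item = null;
        try
        {
            // An instance is always put back before its slot is released,
            // so the bag holds at least one instance at this point.
            items.TryTake(out item);
            return work(item);
        }
        finally
        {
            if (item != null) items.Add(item);
            slots.Release();
        }
    }

    // Disposes all idle instances; call only after all work has finished.
    public void Dispose()
    {
        T item;
        while (items.TryTake(out item))
        {
            IDisposable disposable = item as IDisposable;
            if (disposable != null) disposable.Dispose();
        }
        slots.Dispose();
    }
}
```

With the single-threaded Hunspell class from Part 1, such a pool could be created with a factory like () => new Hunspell("en_us.aff", "en_us.dic") and used as pool.Use(h => h.Spell("Recommendation")). In practice SpellEngine already provides this behavior, so the sketch is only meant to show why one object per core avoids contention.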

General Multi Threading Usage of NHunspell

The following code shows the methods of the SpellEngine and SpellFactory classes. In a real server application, these objects aren't created and destroyed on every request. They are created at service start and disposed of at service end, and all requests are served by the same object (singleton pattern). You would therefore never find this using{} block in a server application; it is here for demonstration purposes only. How it works in a real ASP.NET application is explained later.

C# code sample:
C#
using (SpellEngine engine = new SpellEngine())
{

    Console.WriteLine();
    Console.WriteLine();
    Console.WriteLine("Adding a Language with all dictionaries " + 
                      "for Hunspell, Hyphen and MyThes");
    Console.WriteLine("¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯" + 
                      "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯");
    LanguageConfig enConfig = new LanguageConfig();
    enConfig.LanguageCode = "en";
    enConfig.HunspellAffFile = "en_us.aff";
    enConfig.HunspellDictFile = "en_us.dic";
    enConfig.HunspellKey = "";
    enConfig.HyphenDictFile = "hyph_en_us.dic";
    enConfig.MyThesIdxFile = "th_en_us_new.idx";
    enConfig.MyThesDatFile = "th_en_us_new.dat";
    Console.WriteLine("Configuration will use " + engine.Processors.ToString() + 
                      " processors to serve concurrent requests");
    engine.AddLanguage(enConfig);

    Console.WriteLine();
    Console.WriteLine("Check if the word 'Recommendation' is spelled correctly");
    bool correct = engine["en"].Spell("Recommendation");
    Console.WriteLine("Recommendation is spelled " +
       (correct ? "correctly" : "incorrectly"));


    Console.WriteLine();
    Console.WriteLine("Make suggestions for the word 'Recommendatio'");
    List<string> suggestions = engine["en"].Suggest("Recommendatio");
    Console.WriteLine("There are " + suggestions.Count.ToString() + 
                      " suggestions");
    foreach (string suggestion in suggestions)
    {
        Console.WriteLine("Suggestion is: " + suggestion);
    }

    Console.WriteLine("");
    Console.WriteLine("Analyze the word 'decompressed'");
    List<string> morphs = engine["en"].Analyze("decompressed");
    foreach (string morph in morphs)
    {
        Console.WriteLine("Morph is: " + morph);
    }

    Console.WriteLine("");
    Console.WriteLine("Find the word stem of the word 'decompressed'");
    List<string> stems = engine["en"].Stem("decompressed");
    foreach (string stem in stems)
    {
        Console.WriteLine("Word Stem is: " + stem);
    }

    Console.WriteLine();
    Console.WriteLine("Generate the plural of 'girl' by providing sample 'boys'");
    List<string> generated = 
       engine["en"].Generate("girl", "boys");
    foreach (string word in generated)
    {
        Console.WriteLine("Generated word is: " + word);
    }

    Console.WriteLine();
    Console.WriteLine("Get the hyphenation of the word 'Recommendation'");
    HyphenResult hyphenated = engine["en"].Hyphenate("Recommendation");
    Console.WriteLine("'Recommendation' is hyphenated as: " + hyphenated.HyphenatedWord);


    Console.WriteLine("Get the synonyms of the plural word 'cars'");
    Console.WriteLine("hunspell must be used to get the word stem 'car' via Stem().");
    Console.WriteLine("hunspell generates the plural forms of the synonyms via Generate()");
    ThesResult tr = engine["en"].LookupSynonyms("cars", true);
    if (tr.IsGenerated)
        Console.WriteLine("Generated over stem (The original word " + 
                          "form wasn't in the thesaurus)");
    foreach (ThesMeaning meaning in tr.Meanings)
    {
        Console.WriteLine();
        Console.WriteLine("  Meaning: " + meaning.Description);

        foreach (string synonym in meaning.Synonyms)
        {
            Console.WriteLine("    Synonym: " + synonym);

        }
    }
}
Visual Basic sample:
VB
Using engine As New SpellEngine()

    Console.WriteLine()
    Console.WriteLine()
    Console.WriteLine("Adding a Language with all dictionaries " & _
                      "for Hunspell, Hyphen and MyThes")
    Console.WriteLine("¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯" & _
                      "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯")
    Dim enConfig As New LanguageConfig()
    enConfig.LanguageCode = "en"
    enConfig.HunspellAffFile = "en_us.aff"
    enConfig.HunspellDictFile = "en_us.dic"
    enConfig.HunspellKey = ""
    enConfig.HyphenDictFile = "hyph_en_us.dic"
    enConfig.MyThesIdxFile = "th_en_us_new.idx"
    enConfig.MyThesDatFile = "th_en_us_new.dat"
    Console.WriteLine("Configuration will use " & _
                      engine.Processors.ToString() & _
                      " processors to serve concurrent requests")
    engine.AddLanguage(enConfig)

    Console.WriteLine()
    Console.WriteLine("Check if the word 'Recommendation' is spelled correctly")
    Dim correct As Boolean = engine("en").Spell("Recommendation")
    Console.WriteLine("Recommendation is spelled " & _
                     (If(correct, "correctly", "incorrectly")))


    Console.WriteLine()
    Console.WriteLine("Make suggestions for the word 'Recommendatio'")
    Dim suggestions As List(Of String) = engine("en").Suggest("Recommendatio")
    Console.WriteLine("There are " & suggestions.Count.ToString() & " suggestions")
    For Each suggestion As String In suggestions
        Console.WriteLine("Suggestion is: " & suggestion)
    Next

    Console.WriteLine("")
    Console.WriteLine("Analyze the word 'decompressed'")
    Dim morphs As List(Of String) = engine("en").Analyze("decompressed")
    For Each morph As String In morphs
        Console.WriteLine("Morph is: " & morph)
    Next

    Console.WriteLine("")
    Console.WriteLine("Find the word stem of the word 'decompressed'")
    Dim stems As List(Of String) = engine("en").Stem("decompressed")
    For Each stem As String In stems
        Console.WriteLine("Word Stem is: " & stem)
    Next

    Console.WriteLine()
    Console.WriteLine("Generate the plural of 'girl' by providing sample 'boys'")
    Dim generated As List(Of String) = engine("en").Generate("girl", "boys")
    For Each word As String In generated
        Console.WriteLine("Generated word is: " & word)
    Next

    Console.WriteLine()
    Console.WriteLine("Get the hyphenation of the word 'Recommendation'")
    Dim hyphenated As HyphenResult = engine("en").Hyphenate("Recommendation")
    Console.WriteLine("'Recommendation' is hyphenated as: " & hyphenated.HyphenatedWord)


    Console.WriteLine("Get the synonyms of the plural word 'cars'")
    Console.WriteLine("hunspell must be used to get the word stem 'car' via Stem().")
    Console.WriteLine("hunspell generates the plural forms of the synonyms via Generate()")
    Dim tr As ThesResult = engine("en").LookupSynonyms("cars", True)
    If tr.IsGenerated Then
        Console.WriteLine("Generated over stem (The original " & _
                          "word form wasn't in the thesaurus)")
    End If
    For Each meaning As ThesMeaning In tr.Meanings
        Console.WriteLine()
        Console.WriteLine("  Meaning: " & meaning.Description)

        For Each synonym As String In meaning.Synonyms

            Console.WriteLine("    Synonym: " & synonym)
        Next

    Next
End Using

Spell Checking in ASP.NET

The following sample shows how to integrate a SpellEngine in an ASP.NET application. A real ASP.NET application using this library for spell checking, hyphenation, and thesaurus is: Spell Check, Hyphenation, and Thesaurus Online.

A working sample can be downloaded from SourceForge; look at the Resources section below for the link.

First of all, we need a globally accessible SpellEngine to serve our requests. Therefore, we include a 'Global.asax' in our website and include a static instance of SpellEngine in the application object:

C#
public class Global : System.Web.HttpApplication
{
    static SpellEngine spellEngine;
    static public SpellEngine SpellEngine { get { return spellEngine; } }
    ...
}

After that, we initialize this object in the Application_Start event and dispose of it in the Application_End event in the Global class:

C#
protected void Application_Start(object sender, EventArgs e)
{
    try
    {
        string dictionaryPath = Server.MapPath("Bin") + "\\";
        Hunspell.NativeDllPath = dictionaryPath;

        spellEngine = new SpellEngine();
        LanguageConfig enConfig = new LanguageConfig();
        enConfig.LanguageCode = "en";
        enConfig.HunspellAffFile = dictionaryPath + "en_us.aff";
        enConfig.HunspellDictFile = dictionaryPath + "en_us.dic";
        enConfig.HunspellKey = "";
        enConfig.HyphenDictFile = dictionaryPath + "hyph_en_us.dic";
        enConfig.MyThesIdxFile = dictionaryPath + "th_en_us_new.idx";
        enConfig.MyThesDatFile = dictionaryPath + "th_en_us_new.dat";
        spellEngine.AddLanguage(enConfig);
    }
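    catch (Exception ex)
    {
        // Initialization failed: record the error and release the engine,
        // so that a disposed object is never left in the static field.
        System.Diagnostics.Trace.TraceError("SpellEngine initialization failed: " + ex);
        if (spellEngine != null)
            spellEngine.Dispose();
        spellEngine = null;
    }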
}

protected void Application_End(object sender, EventArgs e)
{
    if( spellEngine != null )
        spellEngine.Dispose();
    spellEngine = null;

}

After that, we can access our SpellEngine from our ASPX pages. For example:

ASP.NET
<%@ Page Language="C#" AutoEventWireup="true" 
  CodeBehind="Default.aspx.cs" Inherits="WebSampleApplication._Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
    
        <asp:TextBox ID="QueryText" runat="server"></asp:TextBox>
        <asp:Button ID="SubmitButton" runat="server" Text="Search" />
        <br />
        <asp:Literal ID="ResultHtml" runat="server" />
    
    </div>
    </form>
</body>
</html>
C#
protected void Page_Load(object sender, EventArgs e)
{
    if (Page.IsPostBack)
    {
        string queryText = QueryText.Text;

        bool correct = Global.SpellEngine["en"].Spell(queryText);

        string result = "<br />";

        if (correct)
        {
            result += Server.HtmlEncode(queryText) + " is correct.<br />Synonyms:<br />";
            ThesResult meanings = Global.SpellEngine["en"].LookupSynonyms(queryText,true);
            if (meanings != null)
            {
                foreach (ThesMeaning meaning in meanings.Meanings)
                {
                    result += "<b>Meaning: " + 
                      Server.HtmlEncode(meaning.Description) + "</b><br />";
                    int number = 1;
                    foreach (string synonym in meaning.Synonyms)
                    {
                        result += number.ToString() + ": " + 
                          Server.HtmlEncode(synonym) + "<br />";
                        ++number;
                    }
                }
            }
        }
        else
        {
            result += Server.HtmlEncode(queryText) + 
              " is not correct.<br /><br />Suggestions:<br />";
            List<string> suggestions = Global.SpellEngine["en"].Suggest(queryText);
            int number = 1;
            foreach (string suggestion in suggestions)
            {
                result += number.ToString() + ": " + 
                  Server.HtmlEncode(suggestion) + "<br />";
                ++number;
            }
        }

        ResultHtml.Text = result;
    }
}

Use in Commercial Applications and Available Dictionaries

Because NHunspell is licensed under the LGPL and MPL, it can be used in commercial applications, and closed source projects are allowed to link against the NHunspell.dll assembly. NHunspell uses the Open Office dictionaries; most of these dictionaries are available for free.

Resources

The Open Office '.oxt' extensions are in fact Zip files. To use them with NHunspell, unzip the dictionaries they contain.
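Since an '.oxt' extension is a Zip file, the dictionaries can also be unpacked in code. The following is a minimal sketch, assuming .NET Framework 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem. The OxtExtractor class and its method name are hypothetical, and the list of file extensions is an assumption based on the dictionary files used in the samples above (.aff, .dic, .idx, .dat).

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (not part of NHunspell): unpacks dictionary files
// from an Open Office '.oxt' extension, which is an ordinary Zip archive.
public static class OxtExtractor
{
    // Assumption: these are the file types NHunspell needs, as used in
    // the LanguageConfig samples (Hunspell, Hyphen, and MyThes files).
    private static readonly string[] WantedExtensions =
        { ".aff", ".dic", ".idx", ".dat" };

    // Extracts the matching files into targetDirectory (flat, without the
    // archive's subfolders) and returns the number of files written.
    public static int ExtractDictionaries(string oxtPath, string targetDirectory)
    {
        Directory.CreateDirectory(targetDirectory);
        int count = 0;

        using (ZipArchive archive = ZipFile.OpenRead(oxtPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // entry.Name is the bare file name; it is empty for folders.
                string extension = Path.GetExtension(entry.Name).ToLowerInvariant();
                if (Array.IndexOf(WantedExtensions, extension) < 0)
                    continue;

                string destination = Path.Combine(targetDirectory, entry.Name);
                entry.ExtractToFile(destination, true); // overwrite existing files
                count++;
            }
        }
        return count;
    }
}
```

A call such as OxtExtractor.ExtractDictionaries("dict-en.oxt", dictionaryPath) would place the files where the LanguageConfig paths expect them; the archive name here is only a placeholder. Because the files are written flat into one folder, two entries with the same file name in different archive subfolders would overwrite each other.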

Important: Check the dictionary license before you use it!

Works with NHunspell too.

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3)