Class diagram V2.2, a new version
Screenshot of the client application, which only demonstrates how to use the components.
As you type, a floating list box shows words similar to the one you are writing, provided such words exist in the lexicon.
Introduction
I am developing a three-tier system with forms to fill in on the client layer, and I felt I needed some kind of component to keep track of words the user had written earlier. If the user wrote a city name in a text field, for example, and later needs to write it again, the component should offer a list of similar words to choose from. I want the component to be easy for developers to integrate. I have tested it with 120,000 words and it ran fine. I have also used it with several text fields; it can handle one or more TextBoxes and/or RichTextBoxes.
Using the code
The components are spread over two namespaces. The namespace TextHelper contains TextBoxHelper, the object you attach a TextBox or a RichTextBox to through the property connectTextBox; the property accepts both because it is typed as TextBoxBase. The helper handles all the dirty work for the client application. You also need to connect a ListBox to it through the property connectFloater; the helper fills this list box with words similar to the one being written and displays it. Finally, you need to connect an AutoText object through the property connectSimpleText so that the helper can find similar words. AutoText lives in a second, nested namespace, TextHelper.TextEngine. Why? So that you can use TextEngine on its own, for example in a console application, where you do not need the TextBox part that a graphical Win32 application requires. The component can also export the lexicon to a file when needed and import it from one. To enable what TextBoxHelper offers in the client application, do this:
using TextHelper;
using TextHelper.TextEngine;

private ITextBoxHelper tH;
private IAutoText sT;
...
public Client()
{
    InitializeComponent();
    this.tH = new TextBoxHelper(20);
    this.sT = new AutoText();
    // Connect the AutoText to the TextBoxHelper.
    this.tH.connectSimpleText = sT;
    this.tH.connectFloater = listBox1;
    this.tH.connectTextBox = textBox1;
}
Now add a TextBox or a RichTextBox (or more, if you like) to your client application, together with a ListBox and three Buttons: one to add words, one to export and one to import. You do not have to write any event handlers for the TextBox/RichTextBox or the ListBox; the TextBoxHelper object attaches its own handlers when the TextBox and ListBox are connected. You do, however, need Click handlers for the button that adds words to the lexicon and for the export and import buttons. They look like this:
private void button1_Click(object sender, System.EventArgs e)
{
    if(textBox1.Text!=String.Empty)
    {
        this.tH.addWord();
        button1.Text = "Submit new word to the lexicon" + " :" +
            this.sT.Count.ToString();
    }
}

private void button2_Click(object sender, System.EventArgs e)
{
    this.tH.exportToFile();
}

private void button3_Click(object sender, System.EventArgs e)
{
    this.tH.importFromFile();
    button1.Text = "Submit new word to the lexicon" + " :" +
        this.sT.Count.ToString();
}
Here is the interface ITextBoxHelper that the TextBoxHelper class implements:
using System;
using System.Windows.Forms;
using TextHelper.TextEngine;

namespace TextHelper
{
    public interface ITextBoxHelper
    {
        ListBox fillList (ref ListBox list_to_fill);
        bool exportToFile ();
        bool importFromFile ();
        void addWord ();
        TextBoxBase connectTextBox {set;}
        ListBox connectFloater {set;}
        IAutoText connectSimpleText {set;}
        string getWritten {get;}
    }
}
The TextBoxHelper class that implements this interface looks like this:
using System;
using System.Drawing;
using System.Windows.Forms;
using TextHelper.TextEngine;

namespace TextHelper
{
    public class TextBoxHelper: ITextBoxHelper
    {
        private ListBox mFloater;
        private TextBoxBase mTextBox;
        private IAutoText mAT;
        private int mFloatLines;
        private char mKey;
        private string mWritten;
        // Minimum number of typed letters before the lexicon is searched.
        private const int ACTIVE_ON = 3;

        public TextBoxHelper(int float_lines){this.mFloatLines=float_lines;}

        public TextBoxBase connectTextBox
        {
            set
            {
                this.mTextBox = value;
                this.mTextBox.TextChanged += new EventHandler(my_box_TextChanged);
                this.mTextBox.KeyDown += new KeyEventHandler(this.my_box_KeyDown);
                this.mTextBox.GotFocus += new EventHandler(my_box_GotFocus);
            }
        }

        public ListBox connectFloater
        {
            set
            {
                this.mFloater = value;
                this.mFloater.KeyPress += new KeyPressEventHandler(my_list_KeyPress);
            }
        }

        public IAutoText connectSimpleText
        {
            set
            {
                // Only the first AutoText that is connected is kept.
                if(this.mAT==null)
                {
                    this.mAT = value;
                }
            }
        }

        private void my_box_TextChanged(object sender, System.EventArgs e)
        {
            this.hideSimilarsList ();
            // Cast to TextBoxBase so that both TextBox and RichTextBox work;
            // a cast to TextBox would fail for a RichTextBox.
            TextBoxBase box = (TextBoxBase)sender;
            int remend = box.SelectionStart;
            int remstart = box.SelectionStart;
            // Walk backwards from the caret to find where the current word starts.
            for(int i=remend-1;i>=0;i--)
            {
                if(box.Text[i]==' ')
                {
                    remstart = i+1;
                    break;
                }
                else if(i==0)
                {
                    remstart = i;
                    break;
                }
                else if(box.Text[i-1]=='\n')
                {
                    remstart = i;
                    break;
                }
            }
            this.mWritten = box.Text.Substring(remstart,remend-remstart);
            if(this.mWritten.Length>=ACTIVE_ON)
                this.mAT.autoFill(this.mWritten);
            if(this.mFloater!=null&&this.mWritten.Length>=ACTIVE_ON)
                this.fillList(ref this.mFloater);
        }

        public void addWord()
        {
            this.mAT.addWord(this.mTextBox.Text);
        }

        private void my_list_KeyPress(object sender,
            System.Windows.Forms.KeyPressEventArgs e)
        {
            // A printable character or Backspace: hand the key back to the text box.
            if((e.KeyChar>=32&&e.KeyChar<=255)||e.KeyChar==8)
            {
                this.mKey = e.KeyChar;
                this.mTextBox.Focus();
            }
            // Enter: insert the rest of the selected word at the caret.
            else if(e.KeyChar==13&&this.mFloater.Items.Count>0)
            {
                int sel_start = this.mTextBox.SelectionStart;
                int nr = this.mAT.getLetterHits();
                int end_pos = sel_start + this.mFloater.
                    SelectedItem.ToString().Remove(0,nr).Length;
                this.mTextBox.Text = this.mTextBox.Text.Insert(
                    sel_start,this.mFloater.SelectedItem.ToString().Remove(0,nr));
                this.mTextBox.SelectionStart = end_pos;
                this.mTextBox.ScrollToCaret ();
                this.mTextBox.Focus ();
            }
            else
            {
                this.hideSimilarsList ();
                this.mTextBox.Focus ();
            }
        }
    }
}
This class does the Win32 application work for you and your clients. It extracts the word you are currently writing and checks it against the lexicon using the AutoText object.
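The word extraction performed in the TextChanged handler can be isolated from the Windows Forms controls. The following is only a minimal sketch of that idea, not the code from the download: the class name CurrentWord and the method BeforeCaret are hypothetical, and the sketch treats any whitespace character as a word boundary, whereas the handler in the article checks only for a space and a newline.

```csharp
using System;

// Hypothetical helper, not part of the article's source code.
public static class CurrentWord
{
    // Returns the characters between the nearest preceding whitespace
    // (or the start of the text) and the caret position.
    public static string BeforeCaret(string text, int caret)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (caret < 0 || caret > text.Length)
            throw new ArgumentOutOfRangeException("caret");

        int start = caret;
        // Walk backwards until the previous character is whitespace.
        while (start > 0 && !char.IsWhiteSpace(text[start - 1]))
            start--;
        return text.Substring(start, caret - start);
    }
}
```

With a TextBoxBase named box, a call such as CurrentWord.BeforeCaret(box.Text, box.SelectionStart) would give the fragment to look up. Keeping this step free of UI types also makes it easy to test on its own.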
Shown below is the interface IAutoText:
using System;

namespace TextHelper
{
    namespace TextEngine
    {
        public interface IAutoText
        {
            void addWord (string word_to_add);
            void addAndClean (string word_to_add);
            string autoFill (string text_to_match);
            string getNext (string text_to_match);
            bool exportToLexiconFile (string filename);
            bool importFromLexiconFile(string filename);
            int getLetterHits ();
            string getWord {get;}
            int Count {get;}
        }
    }
}
And the class that implements IAutoText:
using System;
using System.IO;
using System.Collections;

namespace TextHelper
{
    namespace TextEngine
    {
        public class AutoText: IAutoText
        {
            private StringList mWords;
            private int mRemember;
            private int mSimilar;
            private int mSimilars;
            private int mSimilarTree;
            private const int SMALLEST_WORD = 5;
            private const int BIGGER = 2;
SMALLEST_WORD limits the word length: as the code is written (SMALLEST_WORD = 5 and a >= comparison in addWord), a word is stored only if it is 5 characters or longer. BIGGER limits the words displayed in the list to those that are at least 2 letters longer than the written word. We don't want to show words in the list that have only one more letter than the written one.
public AutoText()
{
    this.mWords = new StringList();
    this.mSimilar = this.mSimilars = this.mSimilarTree = -1;
}

// Returns the number of matching letters, or 0 if the lexicon word does not
// start with the written word. Only the first letter is compared
// case-insensitively.
private int checkLetters(string written_word, string lexicon_word)
{
    int letter_counter = 0;
    for(int i=0;i<written_word.Length;i++)
    {
        if(letter_counter<written_word.Length)
        {
            if(lexicon_word[letter_counter]==written_word[letter_counter]||
                (letter_counter==0&&lexicon_word[letter_counter].ToString().ToLower()
                ==written_word[letter_counter].ToString().ToLower()))
                letter_counter++;
            else return 0;
        }
    }
    return letter_counter;
}

public string autoFill(string text_to_match)
{
    this.mSimilar = this.mSimilars = -1;
    if(text_to_match!=String.Empty)
        if(((StringList)this.mWords[text_to_match.ToLower()[0]]).Count>0)
            return this.autoText(text_to_match, 0);
        else
            return "";
    else return "";
}

public string getWord
{
    get
    {
        if(this.mSimilar>=0)
            return ((StringList)
                this.mWords[this.mSimilarTree])[this.mSimilar].ToString();
        else return "";
    }
}

public string getNext(string text_to_match)
{
    if(((StringList)this.mWords[text_to_match.ToLower()[0]]).Count>0)
    {
        int rememSim = this.mSimilars;
        int rmemeThr = this.mSimilarTree;
        string word = this.autoText(text_to_match, this.mSimilars+1);
        if(rememSim==this.mSimilars)
            this.mSimilars=-1;
        return word;
    }
    else
        return "";
}

private string autoText(string text, int start)
{
    int hits=0;
    // Search only the list indexed by the first letter of the written text.
    for(int i=start;i<((StringList)this.mWords[text.ToLower()[0]]).Count;i++)
    {
        int len = ((StringList)this.mWords[text.ToLower()[0]])[i].ToString().Length;
        string lexicon_word = ((StringList)this.mWords[text.ToLower()[0]])[i].ToString();
        int hit = 0;
        if(text.Length+BIGGER<=len)
        {
            hit = this.checkLetters(text,lexicon_word);
            if(hits<hit)
            {
                hits = hit;
                this.mSimilar = this.mSimilars = i;
                this.mSimilarTree = text.ToLower()[0];
                this.mRemember = hits;
            }
        }
    }
    if(this.mSimilar>=0)
        return ((StringList)this.mWords[this.mSimilarTree])[this.mSimilar].ToString();
    else return "";
}
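The matching rule that checkLetters and the BIGGER test apply together can be summarised as a single predicate. The following is a minimal sketch under stated assumptions, not the author's implementation: the names PrefixRule, IsCandidate and MinExtraLetters are hypothetical, and the sketch lowercases with the invariant culture where the article uses ToLower().

```csharp
using System;

// Hypothetical helper, not part of the article's source code.
public static class PrefixRule
{
    // Assumed to play the role of the BIGGER constant in the article.
    private const int MinExtraLetters = 2;

    // True if lexiconWord is long enough and starts with the written text.
    // Only the first letter is compared case-insensitively.
    public static bool IsCandidate(string written, string lexiconWord)
    {
        if (string.IsNullOrEmpty(written) || lexiconWord == null)
            return false;
        if (lexiconWord.Length < written.Length + MinExtraLetters)
            return false;
        if (char.ToLowerInvariant(written[0]) != char.ToLowerInvariant(lexiconWord[0]))
            return false;
        // The remaining letters must match exactly.
        return string.CompareOrdinal(written, 1, lexiconWord, 1, written.Length - 1) == 0;
    }
}
```

For example, the written text "Sto" accepts "stockholm", while "sto" rejects "stop" because it is only one letter longer.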
The following addWord(..) function is called for every line when the user imports from a file. Here we do not need to clean each word, because the file is assumed to be well formed.
public void addWord(string word_to_add)
{
    if(word_to_add.Length>=SMALLEST_WORD)
        this.mWords.Add (word_to_add);
}
The following addAndClean(...) function is called when the user application adds one or a few new words to the lexicon.
public void addAndClean(string word_to_add)
{
    // Treat tabs and line breaks as word separators.
    word_to_add = word_to_add.Replace('\t',' ');
    word_to_add = word_to_add.Replace('\n',' ');
    word_to_add = word_to_add.Replace('\r',' ');
    string[] tempText = word_to_add.Split(' ');
    for(int i=0;i<tempText.Length;i++)
    {
        // ridof and checkIfExist are not shown in this excerpt; ridof is
        // presumably a char[] of characters to strip from both ends of a word.
        tempText[i] = tempText[i].TrimEnd(ridof);
        tempText[i] = tempText[i].TrimStart(ridof);
        if(tempText[i]!=String.Empty)
        {
            if(!this.checkIfExist(tempText[i]))
            {
                // Store the word with a lowercase first letter.
                tempText[i] = tempText[i][0].ToString().ToLower()+
                    tempText[i].Remove(0,1);
                this.addWord(tempText[i]);
            }
        }
    }
    this.mWords.Sort();
}
public bool exportToLexiconFile(string filename)
{
    try
    {
        using (StreamWriter sw = new StreamWriter(filename))
        {
            // Write one word per line, list by list.
            for(int i=0;i<this.mWords.Count;i++)
                for(int o=0;o<((StringList)this.mWords[i]).Count;o++)
                    sw.WriteLine(((StringList)this.mWords[i])[o].ToString());
        }
        return true;
    }
    catch(Exception)
    {
        return false;
    }
}

public bool importFromLexiconFile(string filename)
{
    try
    {
        using (StreamReader sr = File.OpenText(filename))
        {
            string line="";
            while ((line = sr.ReadLine()) != null)
            {
                this.addWord(line);
            }
            this.mWords.Sort();
        }
        return true;
    }
    catch(Exception)
    {
        return false;
    }
}
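The lexicon file format used above is simply one word per line. As a rough, standalone illustration of that round trip, here is a minimal sketch; the class LexiconFile and its methods Save and Load are hypothetical names, and the sketch uses generic collections and File.WriteAllLines/File.ReadAllLines, which are newer than the .NET 1.1 APIs the article targets.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the article's source code.
public static class LexiconFile
{
    // Writes one word per line.
    public static void Save(string path, IEnumerable<string> words)
    {
        File.WriteAllLines(path, words);
    }

    // Reads one word per line and skips blank lines.
    public static List<string> Load(string path)
    {
        var words = new List<string>();
        foreach (string line in File.ReadAllLines(path))
        {
            string word = line.Trim();
            if (word.Length > 0)
                words.Add(word);
        }
        return words;
    }
}
```

Unlike the methods in the article, which catch every exception and return false, this sketch lets I/O exceptions propagate so that the caller can see why a file could not be read or written. Either choice is defensible; returning a bool keeps the button handlers shorter.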
Here is an interesting construction: a class that inherits from ArrayList and also implements IComparer, so that it can sort its words by string length.
private class StringList: ArrayList, IComparer
{
    const int MAX_TREE = 255;
    static bool mZeroStart = true;

    // Orders strings by length; anything else falls back to the default comparer.
    int IComparer.Compare( object x, object y )
    {
        if(x is string && y is string)
        {
            return( Comparer.Default.Compare(Convert.ToInt32(x.ToString().Length),
                Convert.ToInt32(y.ToString().Length)));
        }
        else
        {
            return( Comparer.Default.Compare( x,y));
        }
    }
The following constructor does extra work only when the top-level list is created; for the child lists it behaves like a normal empty constructor. When the AutoText object creates its StringList, that list fills itself with MAX_TREE further StringLists. Each child StringList holds the words that begin with one particular character, which gives a simple index: a word beginning with an a is stored in the child list at the index equal to the character value of a.
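public StringList(): base()
{
    if(mZeroStart==true)
    {
        // Clear the flag while the child lists are created so that they stay empty.
        mZeroStart = false;
        for(int i=0;i<MAX_TREE;i++)
            base.Add(new StringList());
        // Restore the flag; otherwise a second top-level StringList (for example,
        // one created by a second AutoText) would not get any child lists.
        mZeroStart = true;
    }
}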
Next, we do not override Sort but hide the base method with a new one, so that when the caller decides to sort, each non-empty child list is sorted by word length.
new public void Sort()
{
    for(int i=0;i<this.Count;i++)
        if(((ArrayList)this[i]).Count>0)
            ((ArrayList)this[i]).Sort(this);
}
We have to do the same for Add, which places each word in the child list selected by its first character:
new public int Add(object value)
{
    int result = ((ArrayList)this[value.ToString()[0]]).Add(value);
    return result;
}
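The idea behind StringList, one bucket of words per first character with each bucket sorted by length, can also be sketched with generic collections. This is a deliberately different technique from the article's code: it swaps the fixed 255-slot ArrayList for a Dictionary keyed by the lowercased first character, so characters above 254 need no special handling. The names FirstLetterIndex, SortByLength and Bucket are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, not part of the article's source code.
public class FirstLetterIndex
{
    // One bucket of words per lowercased first character.
    private readonly Dictionary<char, List<string>> buckets =
        new Dictionary<char, List<string>>();

    public void Add(string word)
    {
        if (string.IsNullOrEmpty(word)) return;
        char key = char.ToLowerInvariant(word[0]);
        List<string> bucket;
        if (!buckets.TryGetValue(key, out bucket))
        {
            bucket = new List<string>();
            buckets[key] = bucket;
        }
        bucket.Add(word);
    }

    // Sorts every bucket so that shorter words come first.
    public void SortByLength()
    {
        foreach (List<string> bucket in buckets.Values)
            bucket.Sort((a, b) => a.Length.CompareTo(b.Length));
    }

    // Returns the words that start with the given letter (empty if none).
    public IList<string> Bucket(char firstLetter)
    {
        List<string> bucket;
        return buckets.TryGetValue(char.ToLowerInvariant(firstLetter), out bucket)
            ? bucket
            : new List<string>();
    }
}
```

A lookup for a typed fragment then only has to scan Bucket(fragment[0]) instead of the whole lexicon, which is the same saving the article's first-letter index is after.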
The AutoText object is made so that you can use just this part (if you wish) with your console application. Or you can use the complete set if you are developing an ASP.NET Web project. The idea behind this package was, and is, to remember text the user has written earlier so that the user can reuse it on a later occasion.
Points of interest
If you want to handle more TextBoxes in your application, you will have to add a TextBoxHelper and a ListBox for each text box you add. You can either connect the same AutoText to every one of these TextBoxHelpers, which gives them all words from the same lexicon, or create several AutoText lexicons and connect each to whichever helpers you want. You might want a different language in another text box, for example. Pick and choose!
This is the updated version, V2.2, which can handle word-by-word matching. To see all the functions and code, you will have to download it. I made this version more efficient, so it loads and handles bigger word files than the previous versions. It uses indexing based on the first letter of each word. Earlier versions searched through every string in the whole lexicon for words beginning with the same letter as the written one, which was too slow; now lookups are faster with larger lexicons. You use this code at your own risk, with no warranty. I have tested it with Visual Studio .NET 2003 and .NET 1.1 on a 3 GHz Pentium 4 PC running Windows XP, with a Swedish text file containing 120,000 words. It also uses a comparer class to sort by string length.
History
- Version 1.0, uploaded ? - This had only sentence support and was heavy.
- Version 1.1, uploaded - 7 June 2007 - This had both word and sentence matching, but still heavy.
- Version 1.2, uploaded - 8 June 2007 - Made it easier for the processor.
- Version 1.3, uploaded - 11 June 2007 - Made a client with multiline boxes, and working.
- Version 2.0, uploaded - 15 June 2007 - Removed the recursive calls. There were too many calls! And changed the client application so that it can handle both TextBox and RichTextBox. Reordered the classes in the namespaces, with new different dependencies. Replaced the ToolTip with a floating listbox to show similar words in. And I rewrote this article.
- Version 2.1, uploaded - 16 June 2007 - Now it does not matter if it is lowercase or uppercase, inserted a function to import and export to a file. And I have updated this article.
- Version 2.2, uploaded - 20 June 2007 - Now you can import files big enough to handle over 120,000 words, and there is an indexing search algorithm implemented which indexes on the first letter in the stored words.
- Version 2.2, uploaded - 28 August 2007 - Same as previous, a Visual Studio 2005 solution added.
- Version *, changed - 27 November 2007 - Made some changes to the article.
License
This software is provided 'as-is' without any express or implied warranty. In no event will the author(s) be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose including commercial applications. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.