CodeBox for Windows Forms

12 Nov 2009
A RichTextBox for Windows Forms that supports flexible highlighting and background coloring.

Introduction

This article presents a WinForms text edit control that supports a flexible highlighting and word coloring decoration system. My goal is to present both the control and the optimizations it needed to run at a reasonable speed, and to do so simply enough that a C# novice can follow along.

Background

This is the third CodeBox that I have put up. The first two were built for use with WPF. I would like to think that CodeBox and its radically redesigned descendant CodeBox 2 would be useful even though this WinForms implementation is very different. They all share the same basic decoration concept.

Using the Code

The code should not be hard to use as the control is inherited from the RichTextBox. Without adding Decorations, it is almost indistinguishable from the RichTextBox. Decorations fall into two major categories corresponding to the DecorationScheme and Decoration classes. A DecorationScheme is basically just a conveniently grouped collection of Decoration items. For example:

codeBox.DecorationScheme = WinFormsCodeBox.Decorations.DecorationSchemes.CSharp3;

will give a code coloration similar to what you would see in Visual Studio for C#, while:

codeBox.DecorationScheme = WinFormsCodeBox.Decorations.DecorationSchemes.Xml;

will give you coloration similar to the appearance of XML in Visual Studio. The DecorationScheme is intended to set the basic look of the text. After that, we can set additional decorations.

LineDecoration ld = new LineDecoration()
{
    DecorationType = EDecorationType.Hilight,
    Color = Color.Yellow,
    Line = 2
};
codeBox.Decorations.Add(ld);

The above code will highlight line two of the CodeBox in yellow. Please note that adding a decoration will not automatically update the display. Updates occur when either the Text changes, or:

codeBox.ApplyDecorations(); 

is called. At present, there are a number of premade decorations:

  • StringDecoration: Decoration based on index positions of a single string
  • MultiStringDecoration: Decoration based on index positions of a list of strings
  • RegexDecoration: Decoration based on a single Regular Expression string
  • MultiRegexDecoration: Decoration based on a list of Regular Expression strings
  • ExplicitDecoration: Decoration explicitly specified as a starting position and a length - simple but useful when working with selection
  • MultiExplicitDecoration: Decoration explicitly specified as a list of starting positions and lengths
  • MultiRegexWordDecoration: Decoration based on a list of strings sandwiched between word boundaries
  • DoubleQuotedDecoration: Decoration of text between double quotes
  • LineDecoration: Decoration of a specified line of text
  • MultiLineDecoration: Decoration of a list of specified lines of text
  • DoubleRegexDecoration: Decoration based on a pair of Regular Expression strings where the second expression is matched against the results of the first
  • RegexMatchDecoration: Decoration based on both the match and the group for a Regular Expression

Let's look at a few examples:

ExplicitDecoration ed = new ExplicitDecoration()
{
     Start = this.CodeBox.SelectionStart,
     Length = this.CodeBox.SelectionLength,
     DecorationType = EDecorationType.TextColor ,
     Color = Color.Green
};
this.CodeBox.Decorations.Add(ed);

Assuming that we had a WinFormsCodeBox named CodeBox, this would make the text color of the selection Green.

RegexDecoration singleLineComment = new RegexDecoration()
{
    DecorationType = EDecorationType.TextColor,
    Color = Color.Green,
    RegexString = "//.*"
};

This decoration will color single line comments Green (C# style).

private static List<string> CSharpVariableReservations()
{
    return new List<string>() { "string", "int", "double", 
           "long", "void" , "true", 
           "false", "null"};
}

MultiRegexWordDecoration BlueClasses = new MultiRegexWordDecoration()
{
    Color = Color.Blue,
    Words = CSharpVariableReservations(),
    IsCaseSensitive = true
};

These together will make the words defined in CSharpVariableReservations blue. Note that string would be blue, but happystring would not be so colored.
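
The whole-word behavior can be reproduced with the regular expression word-boundary anchor \b. The sketch below is only an assumption about how such a pattern could be built from a word list; WordPatternSketch and Build are hypothetical names, and the control's own implementation may differ.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordPatternSketch
{
    // Hypothetical helper: joins the words into one alternation and wraps it
    // in \b anchors so that only whole words match.
    public static Regex Build(IEnumerable<string> words, bool isCaseSensitive)
    {
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));
        RegexOptions options = isCaseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;
        return new Regex(@"\b(?:" + alternation + @")\b", options);
    }
}
```

With the words "string" and "int", the resulting pattern matches the word string in "string s" but finds nothing in "happystring" or "printing", because there is no word boundary inside those identifiers.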

RegexMatchDecoration xmlAttributeValue = new RegexMatchDecoration()
{
    Color = Color.Blue,
    RegexString = @"\s(\w+|\w+:\w+|(\w|\.)+)\s*=\s*""(?<selected>.*?)"""
};

This will color the quoted value of each XML attribute blue.
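
To see what a named group contributes, the following sketch runs an attribute pattern with a group called selected and reports only the positions of that group rather than of the whole match. It assumes that RegexMatchDecoration decorates the range of the named group; NamedGroupSketch and SelectedRanges are hypothetical names used for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupSketch
{
    // Attribute pattern with a named group; "selected" marks the part to color.
    private static readonly Regex AttributeValue = new Regex(
        @"\s(\w+|\w+:\w+|(\w|\.)+)\s*=\s*""(?<selected>.*?)""");

    // Returns the (start, length) of the "selected" group for every match.
    public static List<(int Start, int Length)> SelectedRanges(string text)
    {
        var ranges = new List<(int Start, int Length)>();
        foreach (Match m in AttributeValue.Matches(text))
        {
            Group g = m.Groups["selected"];
            ranges.Add((g.Index, g.Length));
        }
        return ranges;
    }
}
```

The whole match includes the leading whitespace, the attribute name, and the quotes, but only the text between the quotes ends up in the returned ranges.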

There are premade decoration schemes for C#, SQL Server, XAML, DBML, and XML. Admittedly, they could probably use a bit of refinement, but they all work pretty well. This is pretty much all one needs to know in order to put WinFormsCodeBox to use.

How It Works

The Basic Idea

WinFormsCodeBox inherits from RichTextBox. The decorations that we want to make are created and are applied by moving the selection around and setting the SelectionColor and SelectionBackColor properties.

Decorations are defined in terms of the TextIndex class. (Please note that in the previous CodeBox articles, TextIndex was called a Pair.)

namespace TextUtils
{
    /// <summary>
    /// A pair of integers referring to the starting position
    /// and length of a piece of text
    /// </summary>
    public class TextIndex : IComparable<TextIndex>
    {
        /// <summary>
        ///The integer position of the first character
        /// </summary>
        public int Start { get; set; }
        
        /// <summary>
        /// Number of characters in range
        /// </summary>
        public int Length { get; set; }
        
        ... Other stuff
      }
}

These are grouped together in TextIndexLists:

public class TextIndexList : List<TextIndex>
{ ... lots of methods}

These TextIndexLists are created by the various decorations. The decoration classes all are descended from the abstract class Decoration.

public abstract class Decoration
{
    public EDecorationType DecorationType { get; set; }
    public Color Color{ get; set; }
    public abstract TextIndexList  Ranges(string text);
    
    ... other stuff
}
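
To make the abstract Ranges method concrete, here is a minimal sketch of a regex-based decoration. The TextIndex, TextIndexList, and Decoration types below are cut-down stand-ins for the classes above (color and decoration type are omitted so the example compiles on its own), and SimpleRegexDecoration is a hypothetical class, not the RegexDecoration shipped with the control.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Simplified stand-ins for the article's types.
public class TextIndex
{
    public int Start { get; set; }
    public int Length { get; set; }
}

public class TextIndexList : List<TextIndex> { }

public abstract class Decoration
{
    public abstract TextIndexList Ranges(string text);
}

// Hypothetical decoration: produces one range per regular expression match.
public class SimpleRegexDecoration : Decoration
{
    public string RegexString { get; set; }

    public override TextIndexList Ranges(string text)
    {
        var ranges = new TextIndexList();
        foreach (Match m in Regex.Matches(text, RegexString))
        {
            ranges.Add(new TextIndex { Start = m.Index, Length = m.Length });
        }
        return ranges;
    }
}
```

A decoration written this way knows nothing about the control; it only maps a string to a list of ranges, which is what lets the ranges be compared and manipulated before anything is drawn.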

These decorations are then applied to the WinFormsCodeBox through the ApplyDecoration method.

private void ApplyDecoration(Decoration d, TextIndexList tl)
{
    switch (d.DecorationType)
    {
         case EDecorationType.TextColor:
             foreach (TextIndex t in tl)
             {
                 this.Select(t.Start, t.Length);
                 this.SelectionColor = d.Color;
             }
             break;
         case EDecorationType.Hilight :
             foreach (TextIndex t in tl)
             {
                 this.Select(t.Start, t.Length);
                 this.SelectionBackColor  = d.Color;
             }
             break;
    }
}

Problems and First Optimization

There are just two little problems with the code as implemented so far. It is too slow to be used, and it suffers from the scrollbar jittering up and down as one types. When this happens, we need to decide what to do. Surrender is a reasonable option, but before one gives up, one should see if there is at least a little hope. One of the problems is that the TextChanged event of the RichTextBox is fired not just when the characters are changed, but for each formatting change. It is easy to fix this.

protected override void OnTextChanged(EventArgs e)
{  
     base.OnTextChanged(e);
     if (!mDecorationInProgress)
     {
         ApplyDecorations();
     }
}
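
One detail worth a sketch is how the flag checked in this override gets set and cleared. The simulation below does not use RichTextBox at all; GuardedBoxSketch is a hypothetical stand-in whose formatting step raises the change notification again, which is the behavior being guarded against. Wrapping the pass in try/finally is my own addition rather than something the control is known to do.

```csharp
using System;

// Hypothetical stand-in for a control whose formatting changes also raise
// the text-changed notification.
public class GuardedBoxSketch
{
    private bool mDecorationInProgress;

    // Counts how many full decoration passes actually ran.
    public int DecorationPasses { get; private set; }

    // Simulates a user edit.
    public void SimulateEdit() => OnTextChanged();

    private void OnTextChanged()
    {
        if (!mDecorationInProgress)
        {
            ApplyDecorations();
        }
    }

    public void ApplyDecorations()
    {
        mDecorationInProgress = true;
        try
        {
            DecorationPasses++;
            // Each formatting change re-raises the notification; the guard
            // keeps it from starting a nested decoration pass.
            for (int i = 0; i < 3; i++)
            {
                OnTextChanged();
            }
        }
        finally
        {
            // Reset even if a decoration throws, so later edits still decorate.
            mDecorationInProgress = false;
        }
    }
}
```

Without the guard, every formatting change would trigger another full pass from inside the current one; with it, each edit costs exactly one pass.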

The mDecorationInProgress flag is set to true at the beginning of the ApplyDecorations method and back to false at the end. This has a significant impact on the speed, but not nearly enough to make the control usable.

The problem with the screen is that whenever the selection is changed, the textbox scrolls to make the selection visible. The jump occurs when it is necessary to scroll down to make something visible and then scroll back up to where we started; the restored position can be off by a few lines. If the RichTextBox had a vertical scroll position property, this would be easy to deal with, but it does not. Fortunately, I had been plagued by this in the past, so I looked up what I did back then. Platform Invoke (P/Invoke) saves the day.

// Requires: using System.Drawing; using System.Runtime.InteropServices;
[DllImport("user32.dll")]
private static extern int SendMessage(IntPtr hwndLock, Int32 wMsg, 
                                      Int32 wParam, ref Point pt);

private Point ScrollPosition
{
    get
    {
        const int EM_GETSCROLLPOS = 0x0400 + 221;
        Point pt = new Point();
        SendMessage(this.Handle, EM_GETSCROLLPOS, 0, ref pt);
        return pt;
    }
    set
    {
        const int EM_SETSCROLLPOS = 0x0400 + 222;
        SendMessage(this.Handle, EM_SETSCROLLPOS, 0, ref value);
    }
}

This just happens to be the correct function. I'm sure that I have some unnamed guru on the internet to thank for that one. With a property like this, we just call:

Point origScroll = ScrollPosition;

before working with the selection, and:

ScrollPosition = origScroll;

after we are finished. This completely takes care of the jumping problem. By a stroke of good fortune, there was another piece of useful code in my old project, the import for locking out screen updates:

[DllImport("user32", CharSet = CharSet.Ansi, SetLastError = true, ExactSpelling = true)]
private static extern int LockWindowUpdate(int hWnd);

Before working with the selections, we call:

LockWindowUpdate(this.Handle.ToInt32());

and afterwards, we call:

LockWindowUpdate(0);

Saving old code is good. By this point, the control can be used for relatively small amounts of text, like the definition of a typical Stored Procedure. The question is whether the control can be optimized further. Some timing data shows that applying the decorations takes between 100 and 1000 times as long as determining which decorations need to be applied. Given that, I formulated three possible strategies for further optimization:

  • Poke around in the class with Reflector and look for helpful internal methods to set the properties
  • Go low level and start working directly with RTF
  • Try to apply the decorations more efficiently

My first instinct was to use Reflector. In WPF, this usually works wonders. Here's what the set portion of the SelectionColor property looks like:

set
{
    this.ForceHandleCreate();
    NativeMethods.CHARFORMATA charFormat = this.GetCharFormat(true);
    charFormat.dwMask = 0x40000000;
    charFormat.dwEffects = 0;
    charFormat.crTextColor = ColorTranslator.ToWin32(value);
    UnsafeNativeMethods.SendMessage(new HandleRef(this, base.Handle), 
                                    0x444, 1, charFormat);
}

That was enough to convince me that Reflector was not going to give me something easily. In the past, I have worked with RTF, and it is not something that I would do as anything other than a last resort. That left me with the last possibility. The RTF that sits in the RichTextBox is a persistent medium. We do not have to update everything on each update. We could just update the areas that have changed and need to be updated. Doing this requires a closer look at TextIndex and TextIndexList.

Lists of TextIndexes and Second Optimization

In order to modify only the changed parts of the text, we need to be able to compute the difference between two TextIndexLists. There are three different ways that we can look at a TextIndexList:

  • A TextIndexList is a List.
  • A TextIndexList can be thought of as a set of line segments on a line.
  • A TextIndexList can be thought of as a BitArray.

For example, consider the following TextIndexList:

TextIndexList tl = new TextIndexList();
tl.Add(new TextIndex() { Start = 1, Length = 2 });
tl.Add(new TextIndex() { Start = 4, Length = 2 });

which can be created more concisely by:

TextIndexList tl = TextIndexList.Parse("1,2:4,2");

This contains the same information as a pair of line segments on a number line, one covering positions 1 and 2 and the other covering positions 4 and 5, which is the same as the bit array [false,true,true,false,true,true]. You might be a bit skeptical and wonder if I just happened to pick a good example. What would it mean if the TextIndexes overlapped? That is the important point. The decorations are designed so that a double application of one is the same as a single application. If we have a yellow background and it overlaps with another yellow background, it is the same as a single bigger yellow background. Order also has no meaning, so the following two TextIndexList objects are effectively equivalent:

TextIndexList tl1 = TextIndexList.Parse("1,2:4,2"); 
TextIndexList tl2 = TextIndexList.Parse("4,2:1,2");
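
The equivalence can be checked by comparing coverage rather than list contents. The sketch below is a self-contained illustration that uses plain tuples instead of TextIndex; its Parse method is my guess at the "start,length:start,length" format based on the example above, not the control's own parser, and RangeEquivalenceSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeEquivalenceSketch
{
    // Parses "start,length:start,length" into (start, length) pairs.
    public static List<(int Start, int Length)> Parse(string s)
    {
        return s.Split(':')
                .Select(part => part.Split(','))
                .Select(p => (Start: int.Parse(p[0]), Length: int.Parse(p[1])))
                .ToList();
    }

    // Marks every covered character position as true.
    public static bool[] ToBits(List<(int Start, int Length)> ranges, int size)
    {
        var bits = new bool[size];
        foreach (var r in ranges)
        {
            int end = Math.Min(size, r.Start + r.Length);
            for (int i = r.Start; i < end; i++)
            {
                bits[i] = true;
            }
        }
        return bits;
    }

    // Two lists are equivalent when they cover exactly the same positions.
    public static bool Equivalent(string a, string b, int size)
    {
        return ToBits(Parse(a), size).SequenceEqual(ToBits(Parse(b), size));
    }
}
```

Under this definition "1,2:4,2" and "4,2:1,2" are equivalent, and so are the overlapping list "1,3:2,3" and the single range "1,4", since both cover positions 1 through 4.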

Geometric Interpretation

The easiest way to understand how to determine the minimum set of TextIndexes that need to be changed is by looking at the situation geometrically in terms of line segments. So, let's consider the situation where there is only one decoration. There are two TextIndexLists, one from before the text change and one from after, representing where in the text the decoration would be applied.

Two things should be noticed: the range where the TextIndexLists differ is clearly defined, and the differences themselves can be thought of as a TextIndexList. The bounds of this new TextIndexList can be thought of as a single TextIndex. When updating the display, we only need to concern ourselves with updating the text formatting in this "Where Different Bounds" area, which runs from the first difference to the last. Usually, we have more than one decoration. The decoration scheme for SQL Server contains 11.

When we have more decorations, they can be visualized as stacked on top of each other. The set of the difference bounds for the individual decorations form a TextIndexList. From that, I can get an overall difference range which is the area that would need to be updated. Finally, we get the actual decorations to be applied by projecting the original decorations onto this combined range.
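
The stacking idea can be sketched with plain bool arrays, one pair (before and after) per decoration. This is an illustration under my own naming (DifferenceBoundsSketch, Bounds, Combine are hypothetical) and is not taken from the control's source.

```csharp
using System;
using System.Collections.Generic;

public static class DifferenceBoundsSketch
{
    // Returns the smallest (start, exclusive end) span containing every position
    // where the two coverage arrays differ, or null when they are identical.
    public static (int Start, int End)? Bounds(bool[] oldBits, bool[] newBits)
    {
        int size = Math.Max(oldBits.Length, newBits.Length);
        int first = -1, last = -1;
        for (int i = 0; i < size; i++)
        {
            bool wasSet = i < oldBits.Length && oldBits[i];
            bool isSet = i < newBits.Length && newBits[i];
            if (wasSet != isSet)
            {
                if (first == -1) first = i;
                last = i;
            }
        }
        if (first == -1) return null;
        return (first, last + 1);
    }

    // Combines the per-decoration bounds into one overall update range.
    public static (int Start, int End)? Combine(IEnumerable<(int Start, int End)?> perDecoration)
    {
        (int Start, int End)? overall = null;
        foreach (var b in perDecoration)
        {
            if (!b.HasValue) continue;   // this decoration did not change
            if (!overall.HasValue)
            {
                overall = b;
            }
            else
            {
                overall = (Math.Min(overall.Value.Start, b.Value.Start),
                           Math.Max(overall.Value.End, b.Value.End));
            }
        }
        return overall;
    }
}
```

If one decoration changed only in positions 4 to 5 and another only at position 0, the combined range is 0 to 6 (exclusive), and only formatting inside that range needs to be reapplied.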

BitArray Interpretation

To calculate the TextIndexLists, we can turn to the BitArray interpretation of the TextIndexList. The routine to produce a BitArray from a TextIndexList is straightforward.

public BitArray ToBitArray(int size)
{
    BitArray bits = new BitArray(size);
    foreach (TextIndex t in this)
    {
        int maxVal = Math.Min(size, t.Start + t.Length);
        for (int i = t.Start; i < maxVal; i++)
        {
            bits[i] = true;
        }
    }
    return bits;
}

The size parameter is just there for a bit of extra flexibility. As long as it is greater than or equal to the upper bounds of the TextIndexList, the conversion will be complete. The conversion back is a little harder to follow.

public static TextIndexList FromBitArray(BitArray bits)
{
    return FromBitArray(bits, new TextIndex (0, bits.Length));
}

public static TextIndexList FromBitArray(BitArray bits, TextIndex index)
{
    TextIndexList tl = new TextIndexList();
    int currentStart = -1;   // start of the run of set bits being scanned, or -1
    int lastBit = Math.Min(index.Start + index.Length, bits.Length);
    for (int i = index.Start; i < lastBit; i++)
    {
        if (bits[i])
        {
            if (currentStart == -1)
            {
                currentStart = i;   // a new run begins here
            }
        }
        else
        {
            if (currentStart != -1)
            {
                // the run ended just before i
                tl.Add(TextIndex.FromStartEnd(currentStart, i));
                currentStart = -1;
            }
        }
    }
    if (currentStart != -1)
    {
        // close a run that reaches the end of the scanned range;
        // lastBit keeps the end inside the BitArray
        tl.Add(TextIndex.FromStartEnd(currentStart, lastBit));
    }
    return tl;
}

Code like this is why we have unit tests. The interesting thing to note is this code:

tl = TextIndexList.FromBitArray(tl.ToBitArray(tl.Bounds.End));

It both merges the overlapping TextIndexes and sorts the TextIndexList tl.
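
A small self-contained sketch of that round trip, using tuples in place of TextIndex, shows the merging and sorting at work. NormalizeSketch is a hypothetical name, and the run-scanning loop is a compact variant of the one above rather than the control's code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class NormalizeSketch
{
    // Round-trips (start, length) ranges through a BitArray, which merges
    // overlapping or touching ranges and returns them in ascending order.
    public static List<(int Start, int Length)> Normalize(
        IEnumerable<(int Start, int Length)> ranges, int size)
    {
        var bits = new BitArray(size);
        foreach (var r in ranges)
        {
            int end = Math.Min(size, r.Start + r.Length);
            for (int i = r.Start; i < end; i++) bits[i] = true;
        }

        var result = new List<(int Start, int Length)>();
        int runStart = -1;
        // Scan one position past the end so a trailing run is closed too.
        for (int i = 0; i <= size; i++)
        {
            bool on = i < size && bits[i];
            if (on && runStart == -1)
            {
                runStart = i;
            }
            else if (!on && runStart != -1)
            {
                result.Add((runStart, i - runStart));
                runStart = -1;
            }
        }
        return result;
    }
}
```

Given the unsorted, overlapping input (4,3), (1,2), (5,4) and a size of 10, the result is the two ranges (1,2) and (4,5).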

In order to find the differences in the two TextIndexLists, we can take the Symmetric Difference of the BitArrays.

public TextIndexList SymetricDifference(TextIndexList tl)
{
    int arraySize = Math.Max(this.Bounds.End, tl.Bounds.End);
    BitArray bArray = this.ToBitArray(arraySize);
    BitArray btlArray = tl.ToBitArray(arraySize);

    BitArray bResult = bArray.Xor(btlArray);
    return TextIndexList.FromBitArray(bResult);
}

If this is not obvious, look at the line segment picture for a bit and it should be. I'm sure that this could be done without BitArrays, but the XOR makes it come out much cleaner. Furthermore, we can create the projections by using the FromBitArray method. Altering the starting and ending points of the loop can be used to restrict the TextIndexList to a specified TextIndex.
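
As a worked example of both ideas, the sketch below XORs two coverage arrays to find what changed and then reads ranges back from a restricted window to form a projection. It uses tuples instead of TextIndex, and the names (SymmetricDifferenceSketch, Changed, FromBits) are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class SymmetricDifferenceSketch
{
    public static BitArray ToBits(IEnumerable<(int Start, int Length)> ranges, int size)
    {
        var bits = new BitArray(size);
        foreach (var r in ranges)
        {
            int end = Math.Min(size, r.Start + r.Length);
            for (int i = r.Start; i < end; i++) bits[i] = true;
        }
        return bits;
    }

    // Reads runs of set bits back as ranges, scanning only [from, to).
    public static List<(int Start, int Length)> FromBits(BitArray bits, int from, int to)
    {
        var result = new List<(int Start, int Length)>();
        int limit = Math.Min(to, bits.Length);
        int runStart = -1;
        for (int i = from; i <= limit; i++)
        {
            bool on = i < limit && bits[i];
            if (on && runStart == -1)
            {
                runStart = i;
            }
            else if (!on && runStart != -1)
            {
                result.Add((runStart, i - runStart));
                runStart = -1;
            }
        }
        return result;
    }

    // Positions covered by exactly one of the two lists.
    public static List<(int Start, int Length)> Changed(
        IEnumerable<(int Start, int Length)> before,
        IEnumerable<(int Start, int Length)> after, int size)
    {
        BitArray diff = ToBits(before, size).Xor(ToBits(after, size));
        return FromBits(diff, 0, size);
    }
}
```

With (1,2), (4,2) before and (1,2), (5,2) after, the changed positions are 4 and 6. Projecting the new list onto the window from 4 to 7 yields the single range (5,2), which is all that has to be reapplied there.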

Third Optimization - Shifting

The second optimization seemed like it should significantly improve things, but the improvement turned out to be rather modest. The problem turned out to be the nature of changes to the text. The most common way a textbox's text is changed is through typing. Each time a key is pressed under normal circumstances (no delete, backspace, or previous selection), the position of every character in the document after the insertion point is increased by one. This means that the area that the updates are restricted to runs from around the current character until the end of the last decoration in the document. It helps a lot at the very end, but not at all at the beginning. Fortunately, this is easy to fix. The TextDelta class takes two strings in its constructor and finds the first difference and the offset connected with the text change. Please note that I am taking advantage of the fact that it is not possible to make a noncontiguous single edit with keyboard and mouse. If we apply the very simple shift function to the previous TextIndexes:

public void Shift(int startingIndex, int amount)
{
    foreach (TextIndex t in this)
    {
        if (t.Contains(startingIndex))
        {
            t.Length += amount;
        }
        else if (t.Start > startingIndex)
        {
            t.Start += amount;
        }
    }
}

the hoped-for performance boost arrives and we have a control that is optimized enough for practical use. (Please note that this is the updated version of the "very simple shift function." The original had an error in it.)
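
The TextDelta class itself is not shown, so the following is only a sketch of the idea: find the first position where the old and new text differ, take the change in length as the offset, and shift the previous ranges accordingly. TextDeltaSketch is a hypothetical stand-in, tuples replace TextIndex, and Contains is assumed to mean Start <= index < Start + Length.

```csharp
using System;
using System.Collections.Generic;

public static class TextDeltaSketch
{
    // Returns the index of the first differing character and the change in
    // length (positive for an insertion, negative for a deletion).
    public static (int FirstDifference, int Offset) Delta(string oldText, string newText)
    {
        int limit = Math.Min(oldText.Length, newText.Length);
        int i = 0;
        while (i < limit && oldText[i] == newText[i])
        {
            i++;
        }
        return (i, newText.Length - oldText.Length);
    }

    // Mirrors the Shift method above on (start, length) pairs.
    public static List<(int Start, int Length)> Shift(
        List<(int Start, int Length)> ranges, int startingIndex, int amount)
    {
        var shifted = new List<(int Start, int Length)>();
        foreach (var r in ranges)
        {
            bool contains = startingIndex >= r.Start && startingIndex < r.Start + r.Length;
            if (contains)
            {
                shifted.Add((r.Start, r.Length + amount));   // edit inside the range
            }
            else if (r.Start > startingIndex)
            {
                shifted.Add((r.Start + amount, r.Length));   // range lies after the edit
            }
            else
            {
                shifted.Add(r);                              // range lies before the edit
            }
        }
        return shifted;
    }
}
```

Typing a y into "int x;" to get "int xy;" gives a first difference at index 5 and an offset of 1. A range at (7,4) then moves to (8,4), a range at (4,3) grows to (4,4), and a range at (0,3) is untouched, so the later difference calculation sees changes only near the edit.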

Conclusion

My intention in this article was both to present a useful control and to make it easy to understand. I'm pretty sure that the former was successful, but I have some doubt about the latter. Optimized code is often mysterious. We find routines that exist only to increase speed in various special situations. Without knowing the history, we often end up wondering if this strange construction exists because the previous programmer didn't know what he was doing or was in love with cut and paste. I did not want to present this code as if it was the obvious way of handling the situation. Perhaps it is, but it certainly was not obvious to me. Hopefully, this short saga of transforming a textbox from one that required about two minutes (for a large file) per keystroke into one that needs around 0.03 seconds (4,000 times better) can be of some service. I would also like to give special thanks to Arthur Jordison whose encouragement made this project prioritized enough to get done.

Updates

11/1/2009 - Bug Fix

The Shift method in the TextIndexList was not taking into account the possibility that the start of the shift might be within the TextIndex. It does now.

11/11/2009 - Bug Fix

Problems with pasting, undo and changes made within variable length decorations have been fixed. This showed a serious deficiency in undo functionality which is in the process of being corrected. I will make a detailed update of my explanation upon its completion.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
