Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Real Time syntax highlighting [wrapper class]

0.00/5 (No votes)
1 Feb 2004 1  
Syntax highlight your RichTextBoxes.

Sample Image - screenshot.png

Introduction

Hello again. This time I wanted to bring you something that was really challenging for me to build. I must have tried at least four totally different solutions to this problem consuming literally days of my time until I reached the best one. At the request of our beloved staff, I will attempt to explain :-)

This class is an extension of a wrapper to the RichEdit control that ships with .NET. How it achieves its speed and also its slickness is by directly modifying the RTF document behind the scenes.

Please note

I am aware that the regex matching can produce unwanted results. 99 out of 100 times, it's the regex pattern you are using. Example: the pattern "<IMG.*>" is wrong. It will only match the FIRST image tag in the document, the proper format is "<IMG.*?>". You need to test your regex's out before you put them in the code otherwise you may get unwanted results.

You can use the RegEx tool Expresso, get it here. My thanks go out to Jim Hollenhorst for making such a great tool. Without Expresso, this project would not be as far along as it is.

The UI code:

First, I will tell you how to use this class in your program. Then I will outline how the class actually gets the job done. To initialize the wrapper is really quite simple. All that is to be done is call New on the class, specifying that you want its events as well. You can accomplish this like so:

Private WithEvents RTBWrapper As New cRTBWrapper

This creates a new instance of the class. However, there is one more important thing you need to do. You need to bind the control to the rich textbox. This can be done in the onLoad Sub or wherever you like.

RTBWrapper.bind(RTB)

Next, you can fill it with whatever you want, syntax from an XML document containing keywords for your programming language, whatever. For this example, I will give it just a few words to "look" for. You could make an initialization Sub to load all this or you could put it in onLoad as well, like so:

    .rtfSyntax.add("<span.*?>", True, True, Color.Red.ToArgb)
    .rtfSyntax.add("<p.*>", True, True, Color.DarkCyan.ToArgb)
    .rtfSyntax.add("<a.*?>", True, True, Color.Blue.ToArgb)
    .rtfSyntax.add("<table.*?>", True, True, Color.Tan.ToArgb)
    .rtfSyntax.add("<tr.*?>", True, True, Color.Brown.ToArgb)
    .rtfSyntax.add("<td.*?>", True, True, Color.Brown.ToArgb)
    .rtfSyntax.add("<img.*?>", True, True, Color.Red.ToArgb)
    .rtfSyntax.add("codeproject", False, True, Color.Red.ToArgb)
    .rtfSyntax.add("not regex and case sensitive", _
                         false, False, Color.Red.ToArgb)

You can also toggle the debug view, I like doing this in a menu and use the code.

sender.checked = RTBWrapper.toggleDebug()

There is also an event that you can hook into, that will report the current positions of the caret within the window.

Private Sub RTBWrapper_position(ByVal PositionInfo _
                As cRTBWrapper.cPosition) Handles RTBWrapper.position
    StatusBar1.Text = "Cursor: " & PositionInfo.Cursor & _
                      ",  Line: " & PositionInfo.CurrentLine & _
                      ", Position: " & PositionInfo.LinePosition
End Sub

You also need to add the method colorDocument to the document load function.

Private Sub mnuOpen_Click(ByVal sender As System.Object,_
               ByVal e As System.EventArgs) Handles mnuOpen.Click
    If OpenDialog.ShowDialog Then
        RTB.LoadFile(OpenDialog.FileName, RichTextBoxStreamType.PlainText)
        RTBWrapper.colorDocument()
    End If
End Sub

You can also hook in to the syntax editor. (The syntax editor is just a quick hack. After all, this is just a demo project.)

    Private Sub MenuItem4_Click(ByVal sender As System.Object, _
              ByVal e As System.EventArgs) Handles MenuItem4.Click
        Dim syntaxView As New cRTFSyntax
        syntaxView.colSyntax = RTBWrapper.rtfSyntax

        If syntaxView.ShowDialog = DialogResult.OK Then
            RTBWrapper.rtfSyntax = syntaxView.colSyntax
        End If

        RTBWrapper.colorDocument()
    End Sub

Completed UI code:

    Private WithEvents RTBWrapper As New cRTBWrapper

    Private Sub Form1_Load(ByVal sender As System.Object, _
               ByVal e As System.EventArgs) Handles MyBase.Load
        With RTBWrapper
            .bind(RTB)
            .rtfSyntax.add("<span.*?>", True, True, Color.Red.ToArgb)
            .rtfSyntax.add("<p.*>", True, True, Color.DarkCyan.ToArgb)
            .rtfSyntax.add("<a.*?>", True, True, Color.Blue.ToArgb)
            .rtfSyntax.add("<table.*?>", True, True, Color.Tan.ToArgb)
            .rtfSyntax.add("<tr.*?>", True, True, Color.Brown.ToArgb)
            .rtfSyntax.add("<td.*?>", True, True, Color.Brown.ToArgb)
            .rtfSyntax.add("<img.*?>", True, True, Color.Red.ToArgb)
        End With
    End Sub

    Private Sub RTBWrapper_position(ByVal PositionInfo As _
              cRTBWrapper.cPosition) Handles RTBWrapper.position
        StatusBar1.Text = "Cursor: " & PositionInfo.Cursor & _
           ",  Line: " & PositionInfo.CurrentLine & _
           ", Position: " & PositionInfo.LinePosition
    End Sub

    Private Sub mnuOpen_Click(ByVal sender As System.Object,_
             ByVal e As System.EventArgs) Handles mnuOpen.Click
        If OpenDialog.ShowDialog Then
            RTB.LoadFile(OpenDialog.FileName, RichTextBoxStreamType.PlainText)
            RTBWrapper.colorDocument()
        End If
    End Sub

    Private Sub MenuItem3_Click(ByVal sender As System.Object, _
              ByVal e As System.EventArgs) Handles MenuItem3.Click
        sender.checked = RTBWrapper.toggleDebug()
    End Sub

    Private Sub mnuSave_Click(ByVal sender As System.Object, _
             ByVal e As System.EventArgs) Handles mnuSave.Click
        If SaveDialog.ShowDialog Then
            RTB.SaveFile(SaveDialog.FileName, RichTextBoxStreamType.PlainText)
        End If
    End Sub

    Private Sub mnuExit_Click(ByVal sender As System.Object, _
                   ByVal e As System.EventArgs) Handles mnuExit.Click
        Me.Close()
    End Sub

    Private Sub mnuNew_Click(ByVal sender As System.Object,_
              ByVal e As System.EventArgs) Handles mnuNew.Click
        RTB.Text = ""
        RTB.Rtf = ""
    End Sub

    Private Sub mnuWordWrap_Click(ByVal sender As System.Object,_
                 ByVal e As System.EventArgs) Handles mnuWordWrap.Click
        RTB.WordWrap = Not RTB.WordWrap
        sender.checked = RTB.WordWrap
    End Sub

    Private Sub MenuItem4_Click(ByVal sender As System.Object,_
               ByVal e As System.EventArgs) Handles MenuItem4.Click
        Dim syntaxView As New cRTFSyntax
        syntaxView.colSyntax = RTBWrapper.rtfSyntax

        If syntaxView.ShowDialog = DialogResult.OK Then
            RTBWrapper.rtfSyntax = syntaxView.colSyntax
        End If

        RTBWrapper.colorDocument()
    End Sub

    Private Sub mnuColor_Click(ByVal sender As System.Object,_
              ByVal e As System.EventArgs) Handles mnuColor.Click
        RTBWrapper.colorDocument()
    End Sub
View Sample Screen Shot

There have been several updates to the class, I will post them when everything's finalized.

Very high-level diagram of the way the code works.

The class code:

First and probably most important are the Import directives. The RegularExpression direction is obvious because we are going to be using regular expressions in this class. The second directive of InteropServices.Marshal are so we can hook into the GetLastWin32Error method for the API functions. If you would like to read more about what we are using the API for, please see the Scrolling with API article.

Imports System.Text.RegularExpressions
Imports System.Runtime.InteropServices.Marshal

The class begins by creating three other classes for storage, these are called cList, cDictionary and cPosition. The first two classes (cList and cDictionary) are just standard run of the mill everyday collection classes for storing color information. The third class cPosition stores the cursors position information.

    Public Class cPosition
        Public Cursor As Integer
        Public CurrentLine As Integer
        Public LinePosition As Integer
        Public xScroll As Integer
        Public yScroll As Integer
    End Class

Let's start with where it all begins, with the update on the cursor position. We have declared the variable _bind. _bind is a reference to the RTB control. _bind was imported with events as such. We can hook in to those events using AddHandler.

        AddHandler _bind.KeyUp, AddressOf update
        AddHandler _bind.MouseUp, AddressOf update
        AddHandler _bind.TextChanged, AddressOf update

We have hooked into the KeyUp, MouseUp and TextChanged events of the control. These events, when they occur, call the overloaded update procedure. Why overloaded? Well, when these events call that Sub, they each call it with different parameters. Example:

    Private Overloads Sub update()
        CursorPosition = getCurrentPosition()
        RaiseEvent position(CursorPosition)
        debugprint(_bind.Rtf, False)
    End Sub
    
    Private Overloads Sub update(ByVal sender As Object, ByVal e _
                         As System.Windows.Forms.KeyEventArgs)  'KeyUp

        update()
    End Sub
    
    Private Overloads Sub update(ByVal sender As Object, ByVal e _
                         As System.Windows.Forms.MouseEventArgs)  'MouseUp

        update()
    End Sub
    
    Private Overloads Sub update(ByVal sender As Object, ByVal e _
                                     As System.EventArgs)  ' TextChanged

        rtfColors.Clear() ' Clear the colors

        readRTFColor()    ' Read and parse the colors in the current document

        readRTFBody()     ' Read and split the RTF into lines

        readTXTBody()     ' Read and split the text into lines

    End Sub

So by overloading the Subs, we can have different events calling the same function. You may notice that the TextChanged event is slightly different from the others. This loads the RTF document into the arrays of the class to prepare for parsing.

We also need to hook into the API for some non-standard functions such as GetScrollPos, SetScrollPos, PostMessageA and LockWindowUpdate. You can read about GetScrollPos, SetScrollPos and PostMessageA in the Scrolling with API article. For now, let's look at the definitions of these functions.

    Private Declare Function GetScrollPos Lib "user32.dll" ( _
        ByVal hWnd As IntPtr, _
        ByVal nBar As Integer) As Integer

    Private Declare Function SetScrollPos Lib "user32.dll" ( _
        ByVal hWnd As IntPtr, ByVal nBar As Integer, _
        ByVal nPos As Integer, ByVal bRedraw As Boolean) As Integer

    Private Declare Function PostMessageA Lib "user32.dll" ( _
        ByVal hwnd As IntPtr, ByVal wMsg As Integer, _
        ByVal wParam As Integer, ByVal lParam As Integer) As Boolean

    Private Declare Function LockWindowUpdate Lib _
             "user32.dll" (ByVal hwnd As Long) As Long

Basically, what these are going to get you is the ability to lock and reset your window's scroll. We use these functions like so:

    Private Sub saveScroll(ByVal hWnd As IntPtr)
        LockWindowUpdate(hWnd.ToInt32)

        CursorPosition.xScroll = GetScrollPos(_bind.Handle, SBS_HORZ)
        CursorPosition.yScroll = GetScrollPos(_bind.Handle, SBS_VERT)
    End Sub


    Private Sub restoreScroll(ByVal hWnd As IntPtr)
        SetScrollPos(_bind.Handle, SBS_HORZ, CursorPosition.xScroll, True)
        PostMessageA(_bind.Handle, WM_HSCROLL, _
          SB_THUMBPOSITION + &H10000 * CursorPosition.xScroll, Nothing)
        SetScrollPos(_bind.Handle, SBS_VERT, CursorPosition.yScroll, True)
        PostMessageA(_bind.Handle, WM_VSCROLL, _
          SB_THUMBPOSITION + &H10000 * CursorPosition.yScroll, Nothing)

        LockWindowUpdate(0&)
    End Sub

saveScroll gets the current scroll locations of your window and locks it. restoreScroll resets the scroll information on your window and unlocks it.

Now the heart of the program, the (other) keyUp event. This is where we tell the program to start looking for words to colorize. This has changed from previous versions where it used to be in a TextChanged event.

    Private Sub _bind_KeyUp(ByVal sender As Object, ByVal e _
             As System.Windows.Forms.KeyEventArgs) Handles _bind.KeyUp
        If e.KeyData = Keys.Space Then
            ' Update the cursor position

            update()
            ' Freeze the windows and get the scroll nfo

            saveScroll(_bind.Handle)

            ' Do any coloring

            applyColor(CursorPosition.CurrentLine)
            ' Update the RTF

            _bind.Rtf = Render()
            ' Reset the cursor

            _bind.SelectionStart = CursorPosition.Cursor
            ' Update the debug window

            debugprint(_bind.Rtf, False)

            ' Unfreeze the windows and Restore the scroll

            restoreScroll(_bind.Handle)
        End If
    End Sub

As you can see, there is not much to this event. What it says is if the last key was a space then get the cursor position, lock the windows, colorize the line and unlock the windows.

Building the data:

Parsing the color information in the header of the document is the job of readRTFColor. readRTFColor parses each color in the header and adds it to the collection of colors so it can be rebuilt later.

    '----------------------------------------------------------------------

    '     Sub: readRTFColor()

    ' Purpose: parse the color information in the header of the document

    '

    Private Function readRTFColor() As Boolean
        Dim strHeader As String = ""

        ' Get Header string

        ' I hate RegEx :-)

        '

        Dim colHeader As MatchCollection = _
          Regex.Matches(_bind.Rtf, "{\\colortbl\s?;(.*);}")

        If RTFDebug Then 
          Console.WriteLine("Colors found: " & colHeader.Count)

        If colHeader.Count = 1 Then
            strHeader = colHeader.Item(0).Groups(1).Value
            If RTFDebug Then 
              Console.WriteLine(colHeader.Item(0).Groups(1).Value)
        Else
            If RTFDebug Then Console.WriteLine("No color info in header")
            Return False
        End If

        ' Translate the string to ARGB color values

        ' I hate RegEx :-)

        '

        Dim colColors As MatchCollection = Regex.Matches(strHeader, "(\d+)")

        If Not colColors Is Nothing Then
            Dim colorCounter As Integer

            For colorCounter = 0 To colColors.Count - 1 Step 3
                Dim newColor As Color = Color.FromArgb(0, _
                                colColors.Item(colorCounter).Value, _
                                colColors.Item(colorCounter + 1).Value, _
                                colColors.Item(colorCounter + 2).Value)

                rtfColors.add(newColor.ToArgb)
            Next
        End If
    End Function

readRTFBody and readTXTBody use standard string functions to extract there information but they do one other important thing. They split the information up into lines (using the split function).

    '----------------------------------------------------------------------

    '     Sub: readRTFBody()

    ' Purpose: Read the RTF and strip off the header info, 

    '          and split it into limes.

    '          RegEx avoided here !

    '

    Private Function readRTFBody() As String
        Dim tmp As String = _bind.Rtf
        Dim bodyStart As Integer

        Dim position As Integer = InStr(tmp, "\viewkind4")
        If InStr(position, tmp, " ") < 0 Then
            bodyStart = InStr(position, tmp, "\par")
        Else
            bodyStart = InStr(position, tmp, " ")
        End If

        Dim tmpRtfBody As String = tmp.Substring(bodyStart)
        rtfBody = Split(tmpRtfBody, "\par")
    End Function

    '----------------------------------------------------------------------

    '     Sub: readTXTBody()

    ' Purpose: Split the text portion into lines

    '          RegEx avoided here !

    '

    Private Function readTXTBody() As String
        Dim tmpText As String
        Dim counter As Integer

        tmpText = _bind.Text
        txtBody = Split(tmpText, Chr(10))

        For counter = 0 To UBound(txtBody)
            If txtBody(counter) Is Nothing Then
                txtBody(counter) = ""
            End If
        Next

        If RTFDebug Then Console.WriteLine("")
        If RTFDebug Then 
          Console.WriteLine("Text lines read: " & UBound(txtBody))
    End Function

Changing the data:

Once we have the data parsed and in arrays according to its line, it's really easy to change. Basically, we just loop through the current TEXT VERSION of the line and insert the colors that we gathered from cDict.

    '----------------------------------------------------------------------

    '     Sub: applyColor

    ' Purpose: Apply a new color format to a line

    '

    Private Function applyColor(ByVal line As Integer)
        Dim colWords As MatchCollection = Regex.Matches(txtBody(line), "\w+")
        Dim Word As Match
        Dim pattern As DictionaryEntry

        rtfBody(line) = txtBody(line)

        For Each Word In colWords

            If rtfSyntax.exists(Word.Value) Then
                Dim inDict = rtfSyntax.exists(Word.Value)
                Dim colorindex As Integer

                If inDict <> -1 Then
                    rtfColors.add(inDict)
                    colorindex = rtfColors.exists(inDict)

                    rtfBody(line) = _
                       Regex.Replace(rtfBody(line), _
                       "\b" & Word.Value & "\b", _
                       "\cf" & colorindex + 1 & Word.Value & "\cf0 ")
                    If RTFDebug Then 
                       Console.WriteLine("Applying color to: " & line)
                End If
            End If
        Next

Constructing the data:

Now, all the changes to the individual lines of the files are done to the arrays then reassembled later in the render procedures. What the render procedures does is take the collected information and merge the lines back together. Like this:

    '----------------------------------------------------------------------

    '     Sub: Render()

    ' Purpose: Put the moded RTF back together

    '

    Private Function Render() As String
        Return reBuildHeader() & "\viewkind4 " & reBuildBody()
    End Function

    '----------------------------------------------------------------------

    '     Sub: reBuildHeader()

    ' Purpose: Using the colortable supplied by readRTFColor() rebuilds the 

    '          headers after all you might have added a color!

    '

    Private Function reBuildHeader() As String
        Dim thisColor As Integer
        Dim DocHead As String

        DocHead = "{\rtf1\ansi\ansicpg1252\deff0\deflang1033"
        DocHead &= "{\colortbl ;"

        For Each thisColor In rtfColors
            Dim setColor As Color = Color.FromArgb(thisColor)
            DocHead &= "\red" & setColor.R
            DocHead &= "\green" & setColor.G
            DocHead &= "\blue" & setColor.B & ";"
            If RTFDebug Then Console.WriteLine("Adding Header Color")
        Next

        DocHead &= "}"
        Return DocHead
    End Function

    '----------------------------------------------------------------------

    '     Sub: reBuildBody()

    ' Purpose: Build the moded RTF Body

    '

    Private Function reBuildBody() As String
        Dim DocBody As String = ""
        Dim rtfLine As String = ""
        Dim counter As Integer

        For counter = 0 To UBound(rtfBody)
            Dim tmp As String = rtfBody(counter)
            If tmp = "" Then tmp = " "
            DocBody &= tmp & "\par" & vbCrLf
        Next

        If RTFDebug Then 
            Console.WriteLine("RTF body lines rendered: " & UBound(rtfBody))
        Return DocBody
    End Function

So now, when the user looks at his RTF, it is in fact your RTF that was built from scratch each time.

Extra goodies:

There are two Subs that you can use to wow people. They are called changeColor. What these Subs do is modify the color information that gets rendered to the document so you can change all the blues to reds or all the greens to purple or whatever you want. And it will be instantaneous.

    '

    '     Sub: ChangeColor

    ' Purpose: Change the color of an element document wide. 

    '          Basicly this changes

    '          the info in the color table used to build the headers.  

    '    Note: This changes a color to a color

    '

    Public Sub ChangeColor(ByVal srcColor As Color, ByVal toColor As Color)
        Dim index = rtfColors.exists(srcColor.ToArgb)

        If index <> -1 Then
            rtfColors.item(index) = toColor.ToArgb
        End If
    End Sub

    '

    '     Sub: ChangeColor

    ' Purpose: Change the color of an element document wide. 

    '          Basicly this changes

    '          the info in the color table used to build the headers.  

    '    Note: This changes a index value of a color to a color

    '

    Public Function ChangeColor(ByVal index As Integer, ByVal toColor As Color)
        rtfColors.item(index) = toColor.ToArgb

        reBuildHeader()
    End Function

Well, that's about it. I'm sure I will be adding more to this as time goes on but for now enjoy.

Disclaimer

This code is free for personal use. However, if you are going to use it for commercial purposes, you need to purchase a license.

Updates

I found a few issues with the wrapper class:

  • Feb 3rd
    • Fixed demo app's docking order.
    • Fixed the '}' and '{' chars being filtered.
    • Fixed spacebar bug.
  • Feb 5th
    • Three new API functions to fix the scrolling problems.
    • Added wordwrap support.
    • Improved mouse support.
    • Added debugview.
    • Fixed top line empty bug.
  • Feb 6th
    • Added RegEx matching "<IMG.*?>" etc.
    • Added Case Insensitive option.
    • Added syntax highlighter edit form.
    • Added colorDocument method.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here