
HTML syntax highlighting with the Rich Edit control

18 Apr 2002
An extension to CRichEditCtrl to illustrate simple syntax highlighting

Sample Image - demo.gif

Introduction

This control is a fairly simple extension of CRichEditCtrl that provides basic syntax highlighting for HTML code. It does this in two different ways: passively while you type, and actively as a forced parse of the text. At the moment, this article is just an introduction to what was required to make it work. If some of the problems are ironed out, it may become a production-strength control with documentation to the same level as my other controls (CButtonSSL and CTabCtrlSSL).

Rules

The control applies a few simple rules to perform its syntax highlighting:

  1. Anything starting with '<!--' and ending with '-->' is a Comment.
  2. Anything between two double quotes (" ") is Quoted Text.
  3. Anything from '<' to '>' (except comments) is a Tag.
  4. Anything else is Normal Text.
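As a rough illustration of how the four rules interact, here is a minimal sketch in standard C++ (no MFC) that labels every character of a string. The names `Kind` and `classify` are hypothetical and this is not the control's actual code. It assumes that the delimiters themselves take the color of the item they delimit, and that quoted text inside a tag returns to the Tag format after the closing quote.

```cpp
#include <string>
#include <vector>

enum class Kind { Normal, Tag, Quoted, Comment };

// Label each character of `text` according to the four rules.
// (Hypothetical helper for illustration only.)
std::vector<Kind> classify(const std::string& text)
{
    std::vector<Kind> kinds(text.size(), Kind::Normal);  // Rule 4 is the default
    bool inTag = false;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text.compare(i, 4, "<!--") == 0) {
            // Rule 1: a comment runs to the closing "-->" (or to the end
            // of the text if it is never closed).
            std::size_t close = text.find("-->", i + 4);
            std::size_t end = (close == std::string::npos) ? text.size() : close + 3;
            for (; i < end; ++i) kinds[i] = Kind::Comment;
        } else if (text[i] == '"') {
            // Rule 2: quoted text, including both quote characters.
            std::size_t close = text.find('"', i + 1);
            std::size_t end = (close == std::string::npos) ? text.size() : close + 1;
            for (; i < end; ++i) kinds[i] = Kind::Quoted;
        } else if (text[i] == '<' || inTag) {
            // Rule 3: a tag runs from '<' to '>' inclusive.
            kinds[i] = Kind::Tag;
            inTag = (text[i] != '>');
            ++i;
        } else {
            ++i;  // Rule 4: normal text
        }
    }
    return kinds;
}
```

Because the comment check comes first, a quote inside a comment stays a comment, and because the quote check comes before the tag check, a '>' inside an attribute value does not end the tag.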

Strategy

To provide the syntax highlighting, the control handles the WM_CHAR message to determine when a key has been pressed (at this point the character has already been added to the control). It keeps track of the last four characters entered so that it can determine when a comment has been started or finished, but generally it looks only at the character just typed. For each item (Comment, Quoted Text or Tag) a flag indicates whether we're waiting for an end delimiter. When the start delimiter is found, the flag is set and the word character format is changed; when the end delimiter is found, the flag is cleared, so the next character entered will update the word character format again.
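The per-keystroke bookkeeping described above might look something like the following sketch. It is written in standard C++ rather than as a real WM_CHAR handler, and `TypingTracker` and `onChar` are hypothetical names, not part of the control. It assumes the first three characters of '<!--' are initially treated as a tag, since the comment cannot be recognized until the fourth character arrives.

```cpp
#include <string>

enum class Kind { Normal, Tag, Quoted, Comment };

// Hypothetical helper: tracks the "waiting for an end delimiter" flags as
// characters arrive one at a time, the way an OnChar handler would see them.
class TypingTracker {
public:
    // Feed the character just typed; returns the format to use for it.
    Kind onChar(char ch)
    {
        // Remember only the last four characters typed.
        history_ += ch;
        if (history_.size() > 4) history_.erase(0, history_.size() - 4);

        if (inComment_) {
            if (endsWith("-->")) inComment_ = false;  // end delimiter clears the flag
            return Kind::Comment;
        }
        if (inQuote_) {
            if (ch == '"') inQuote_ = false;
            return Kind::Quoted;
        }
        if (endsWith("<!--")) {
            // The "<!-" typed so far was reported as Tag; a real control
            // would have to re-colour those three characters here.
            inComment_ = true;
            inTag_ = false;
            return Kind::Comment;
        }
        if (ch == '"') { inQuote_ = true; return Kind::Quoted; }
        if (inTag_) {
            if (ch == '>') inTag_ = false;
            return Kind::Tag;
        }
        if (ch == '<') { inTag_ = true; return Kind::Tag; }
        return Kind::Normal;
    }

private:
    bool endsWith(const std::string& suffix) const
    {
        return history_.size() >= suffix.size() &&
               history_.compare(history_.size() - suffix.size(),
                                suffix.size(), suffix) == 0;
    }

    std::string history_;     // at most the last four characters
    bool inComment_ = false;  // waiting for "-->"
    bool inQuote_ = false;    // waiting for a closing '"'
    bool inTag_ = false;      // waiting for '>'
};
```

Four characters of history are enough because the longest delimiter, '<!--', is four characters long; every other decision needs only the character just typed plus the flags.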

If the position at which the text has been entered is not the end of the control, an edit flag is set to indicate that text is being inserted into the line rather than added to the end of the existing text. After the OnChar handler, the EN_CHANGE* notification handler is called. If the edit flag is set, the whole line is parsed so that new elements can be colored correctly, e.g. when adding quoted text inside a tag.

* Aside: handling the EN_CHANGE notification.

In order to handle change notifications you must set the event mask for the rich edit control. This is performed in PreSubclassWindow and OnCreate using the following code:

// Set the event mask to include ENM_CHANGE
long lMask = GetEventMask();
lMask |= ENM_CHANGE;
SetEventMask(lMask);

The OnChange handler checks the edit flag and, if it's set, calls ParseLine() to parse the current line.

In addition to ParseLine for single-line parsing, there is also the function ParseAllLines, which parses all of the text in the control, though not by simply calling ParseLine for each line. ParseAllLines parses the text in context: if an item is started but not terminated on a line, the next line is colored accordingly, which provides support for multi-line comments, quoted text and tags. The strategy is basically the same as for parsing a single line, except that the wait flags are carried over from one line to the next.
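To show why carrying the wait flags matters, here is a small sketch in standard C++ that tracks only the flags, not the coloring. `WaitFlags`, `scanLine` and `flagsAtLineStarts` are hypothetical names chosen for this example; the real ParseAllLines may be organized quite differently.

```cpp
#include <string>
#include <vector>

// The "waiting for an end delimiter" flags carried from line to line.
struct WaitFlags {
    bool comment = false;
    bool quote = false;
    bool tag = false;
};

// Scan one line, updating `flags` in place so that the caller can pass
// them on to the next line.
void scanLine(const std::string& line, WaitFlags& flags)
{
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (flags.comment) {
            if (line.compare(i, 3, "-->") == 0) { flags.comment = false; i += 2; }
        } else if (flags.quote) {
            if (line[i] == '"') flags.quote = false;
        } else if (line.compare(i, 4, "<!--") == 0) {
            flags.comment = true;
            i += 3;
        } else if (line[i] == '"') {
            flags.quote = true;
        } else if (line[i] == '<') {
            flags.tag = true;
        } else if (line[i] == '>') {
            flags.tag = false;
        }
    }
}

// Returns the flags in effect at the START of each line.
std::vector<WaitFlags> flagsAtLineStarts(const std::vector<std::string>& lines)
{
    std::vector<WaitFlags> result;
    WaitFlags flags;              // nothing is open before the first line
    for (const std::string& line : lines) {
        result.push_back(flags);
        scanLine(line, flags);    // flags persist into the next iteration
    }
    return result;
}
```

Given the lines `<!-- start`, `still comment` and `end -->`, the second line begins with the comment flag set, so it would be colored as a comment. Scanning that second line on its own with fresh flags would see nothing but normal text, which is the limitation of single-line parsing.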

Coloring

In order to change the colors, we need to store a CHARFORMAT structure and use one of the Rich Edit character format functions, SetSelectionCharFormat, SetWordCharFormat or SetDefaultCharFormat. CHtmlRichEditCtrlSSL uses SetDefaultCharFormat in OnCreate and PreSubclassWindow, but everything else uses SetWordCharFormat.

The approach to coloring is to find out where in the text the bit you want to color is, store the current selection, select the text you're interested in, set its character format and then restore the original selection. The following code would color the whole of the current line red:

// Populate a CHARFORMAT structure accordingly
CHARFORMAT cf;
::ZeroMemory(&cf, sizeof(cf));
cf.cbSize = sizeof(CHARFORMAT);
cf.dwMask = CFM_COLOR;
cf.dwEffects = 0;    // CFE_AUTOCOLOR must be clear for crTextColor to apply
cf.crTextColor = RGB(255, 0, 0);

// Store the current selection
CHARRANGE crCurrent;
GetSel(crCurrent);

// The character position of the start of the line
// (nLineIndex is the zero-based index of the line being coloured)
long lSelStart = LineIndex(nLineIndex);

// Get the text for the current line
CString strLineText;
int nLineLength = LineLength(lSelStart);
int nRead = GetLine(nLineIndex, strLineText.GetBuffer(nLineLength + 3), 
         nLineLength + 1);
strLineText.ReleaseBuffer(nRead);

// Get the end point for the selection
long lSelEnd = lSelStart + strLineText.GetLength();

// Select the line and colour it
SetSel(lSelStart, lSelEnd);
SetWordCharFormat(cf);

// Restore the original selection
SetSel(crCurrent);
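When a line contains several differently colored items, the same select-and-format step has to be repeated once per item. One way to find "where in the text the bit you want to color is" would be to collapse a per-character classification into contiguous runs expressed as absolute character positions. The sketch below is standard C++ with hypothetical names (`Run`, `toRuns`); it assumes the classification already exists and that `lineStart` is the value LineIndex returned for the line.

```cpp
#include <cstddef>
#include <vector>

enum class Kind { Normal, Tag, Quoted, Comment };

// One contiguous stretch of characters sharing a format. `start` and `end`
// are absolute character positions (end exclusive), i.e. the kind of pair
// that could be passed to SetSel before applying a character format.
struct Run {
    long start;
    long end;
    Kind kind;
};

// Collapse per-character kinds for one line into runs, offset by the
// character position at which the line starts.
std::vector<Run> toRuns(const std::vector<Kind>& kinds, long lineStart)
{
    std::vector<Run> runs;
    for (std::size_t i = 0; i < kinds.size(); ++i) {
        long pos = lineStart + static_cast<long>(i);
        if (!runs.empty() && runs.back().kind == kinds[i]) {
            runs.back().end = pos + 1;                 // extend the current run
        } else {
            runs.push_back({pos, pos + 1, kinds[i]});  // start a new run
        }
    }
    return runs;
}
```

Each run then needs only one selection change and one format call, which keeps the number of selection changes per line small.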

Automatic Updates

To support multi-line comments, etc., an automatic updating feature was added that colors all of the text in context. This is done simply with a timer whose OnTimer handler calls ParseAllLines. By default this occurs every 10 seconds, though the function AutoParse allows you to turn this feature off or to adjust the interval between refreshes.

Unfortunately, this approach is flawed when there are more lines of text than are visible. When the parsing is complete and the selection is restored, the control scrolls so that the current line is the first visible line of the control. This causes the control to scroll as you type, though typing isn't affected because the insertion point is still the same.

I tried to overcome this by storing the first visible line at the start of parsing and then scrolling back to it afterwards, but this still resulted in annoying scroll behavior, even after adding calls to SetRedraw(FALSE) before parsing and SetRedraw(TRUE) followed by Invalidate(FALSE) after the call to LineScroll.

If anyone has any suggestions for improving this automatic update process, or a different strategy for coloring in context, I'd be glad to hear them.

Class Interface

// Construction/Destruction
public:
    // Default constructor
    CHtmlRichEditCtrlSSL();
    // Default destructor
    virtual ~CHtmlRichEditCtrlSSL();

public:
// Character format functions
    // Sets the character format to be used for Tags
    void SetTagCharFormat(int nFontHeight = 8, 
        COLORREF clrFontColour = RGB(128, 0, 0), 
        CString strFontFace = _T("Courier New"),
        bool bParse = true);
    // Sets the character format to be used for Tags
    void SetTagCharFormat(CHARFORMAT& cfTags, bool bParse = true);
    // Sets the character format to be used for Quoted text
    void SetQuoteCharFormat(int nFontHeight = 8, 
        COLORREF clrFontColour = RGB(0, 128, 128), 
        CString strFontFace = _T("Courier New"),
        bool bParse = true);
    // Sets the character format to be used for Quoted text
    void SetQuoteCharFormat(CHARFORMAT& cfQuoted, bool bParse = true);
    // Sets the character format to be used for Comments
    void SetCommentCharFormat(int nFontHeight = 8, 
        COLORREF clrFontColour = RGB(0, 128, 0), 
        CString strFontFace = _T("Courier New"),
        bool bParse = true);
    // Sets the character format to be used for Comments
    void SetCommentCharFormat(CHARFORMAT& cfComments, bool bParse = true);
    // Sets the character format to be used for Normal Text
    void SetTextCharFormat(int nFontHeight = 8, 
        COLORREF clrFontColour = RGB(0, 0, 0), 
        CString strFontFace = _T("Courier New"),
        bool bParse = true);
    // Sets the character format to be used for Normal Text
    void SetTextCharFormat(CHARFORMAT& cfText, bool bParse = true);

// Parsing functions
    // Enables/disables automatic parsing of all lines
    void AutoParse(bool bParse = true, UINT uiInterval = 10000);
    // Parses the specified line to colour it accordingly. Defaults to the
    // current line.
    void ParseLine(int nLineIndex = -1);
    // Parses all lines in the control, colouring each line accordingly.
    void ParseAllLines();

// Miscellaneous functions
    // Loads the contents of the specified file into the control. Replaces
    // the existing contents and parses all lines.
    void LoadFile(CString& strPath);

// Overrides
    // ClassWizard generated virtual function overrides
    //{{AFX_VIRTUAL(CHtmlRichEditCtrlSSL)
    protected:
    virtual void PreSubclassWindow();
    //}}AFX_VIRTUAL

Summary

If nothing else, I suppose this article gives a simple introduction to coloring text in the Rich Edit control. It also shows an approach to syntax highlighting that isn't dependent on a keyword list.

If it helps anyone out, then that's great; if not, maybe I'll get some help with the automatic refresh ;) When that's fixed, I'll document it properly and do a full demo that exercises the class to its full potential.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.