Keyword Highlighting with One Line of Code: Applied Use of HttpResponse.Filter in ASP.NET to Modify the Output Stream

3 Jan 2014
HttpResponse.Filter post-processes the output of an ASP.NET page in order to modify the HTML document before it is sent to the client, similar to output buffering in PHP. The example wraps instances of a keyword on the page in an HTML element to have a highlighting style applied to it.

Introduction

No doubt you have seen many web pages that highlight the keyword in yellow in the results of a keyword search, making it easy for the reader to find the keyword in the context in which it appears. There are, of course, many ways to approach this task.

This article discusses:

  • Implementation of the (mostly) undocumented HttpResponse.Filter property
  • Implementation of a simple search box to highlight a word or phrase on a page
  • Use of Regex.Replace with a MatchEvaluator delegate

Background

This week when I approached the implementation of keyword highlighting, I considered a few possible ways:

  1. Client-side DOM manipulation with JavaScript
  2. Search and replace on the text to which I have programmatic access
  3. An ASP.NET HTTP Module or HTTP Handler, compiled as a standalone assembly and installed in Web.config
  4. Manipulating the output stream, similar to output buffering in PHP

It was the last method that I decided to pursue, because it had the potential to operate independently of the page's code (unlike #2), wouldn't require processor-intensive client-scripting (unlike #1), and wouldn't require any server-side configuration (unlike #3).

The example site consists of a web page that displays the text from Charles Dickens' Great Expectations. In the upper-right corner of the page floats a search box into which you can enter a word or phrase. It also presents some options, such as case-sensitive searching, whole-word searching, and searching using regular expressions instead of literal text.

Screen shot of Great Expectations without highlight

When a word or phrase is entered into the search box and the button clicked, the page is shown again with the search term highlighted throughout the document.

Screen shot of Great Expectations with highlighting

Terminology

For the sake of clarity, I'll refer to the search term or keywords as the needle. Likewise, I'll refer to the text that is being searched as the haystack. This nomenclature is also used throughout the code for consistency.

Using the Code

Earlier in the article, I promised to add highlighting to a page with one line of code. Here is the code in context:

/// <summary>
/// Handles the Load event of the Page control.
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="EventArgs"/> instance containing the event data.
/// </param>
protected void Page_Load(object sender, EventArgs e)
{
    // Add some content from a resource.
    Content.Text = Properties.Resources.Great_Expectations__by_Charles_Dickens;

    if(IsPostBack)
    {
        // Implement a highlighter with one line of code:
        Response.Filter = new HighlightFilter(Response, Needle.Text)    // The magic line.
                                {
                                    IsHtml5 = false, 
                                    MatchCase = MatchCase.Checked, 
                                    MatchWholeWords = MatchWholeWords.Checked, 
                                    UseRegex = UseRegularExpressions.Checked
                                }; 

        // Don't try to highlight the search box.
        Needle.Text = string.Empty;
    }
} 

As you can see, when the Web Form is posted back, the needle is retrieved from Needle.Text. In the code-behind, we construct a HighlightFilter, passing it the HttpResponse object and the needle.

I have also set some of the properties of HighlightFilter using an object initializer. Most of the properties should be self-explanatory, like MatchCase, MatchWholeWords, and UseRegex.

When the IsHtml5 property is true, instances of the needle are wrapped in the HTML5 <mark> element, which was designed for this purpose. If it is false, a div with its class set to "highlight" is used instead. For greater control, you can explicitly set the values of the OpenTag and CloseTag properties. For ultimate control, you can subscribe to the Highlighting event and modify the supplied Haystack using the supplied Needle, or even subclass HighlightFilter entirely.

Of course, the usefulness of post-processing in this manner need not be limited to highlighting. Using the Filter class, one could subscribe to the Filtering event to modify the output stream, or subclass Filter and override the protected OnFilter method. There are numerous applications including:

  • obfuscation
  • minification
  • altering the output of sealed classes
  • translation (e.g. RSS → HTML)
  • insertion of common code (e.g. reverse master page)

If you find other uses, please share with a comment.
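As a rough illustration of one application from the list above, the per-buffer transformation behind a minifying filter might be sketched as follows. MinifySketch and CollapseWhitespace are hypothetical names, not part of the article's download, and the rule shown is deliberately naive: it assumes the page contains no <pre>, <textarea>, or inline script whose whitespace matters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MinifySketch
{
    // Hypothetical helper: collapses any run of whitespace between two tags
    // into a single space. A single space is kept (rather than none) so that
    // the spacing between inline elements still renders the same way.
    // Assumption: no <pre>, <textarea>, or script content needs preserving.
    public static string CollapseWhitespace(string html)
    {
        return Regex.Replace(html, @">\s+<", "> <");
    }

    public static void Main()
    {
        Console.WriteLine(CollapseWhitespace("<p>a</p>\r\n\r\n    <p>b</p>"));
    }
}
```

A function like this would be called from whatever hook does the filtering, with the buffered string as its input.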

How It Works

To modify the page after it had been rendered, I would need to somehow intercept the output stream, Page.Response.OutputStream.

A bit of searching led me to the Filter property of the HttpResponse class. The documentation for the property leaves quite a bit to the imagination. The property is assigned a Stream that filters writes, and the example refers to a magical (i.e. undocumented) UpperCaseFilterStream that takes the property itself as a parameter to the constructor, and ta-da! Hmm… (Had I bothered to find and unpack Samples.AspNet.CS.Controls, maybe I would have solved this one.)

I created the Filter class, which takes the HttpResponse object as a parameter to the constructor. The class itself inherits Stream, but the implementation of the abstract class simply invokes methods and properties of the HttpResponse object's OutputStream stream, with the exception of Write(byte[] buffer, int offset, int count). The overridden Write method decodes the buffer to a string using the response's ContentEncoding, applies a filter, and re-encodes and writes out the buffer to the OutputStream.

The Filter class by itself doesn't do anything useful, but its potential is unlimited. To make it filter something, one needs to subclass it and override OnFilter, or instantiate it and subscribe to the Filtering event, which passes a FilterEventArgs object containing the buffered string to be manipulated.
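The pattern described above can be sketched without any dependency on System.Web. This is a minimal illustration and not the article's actual Filter class: TextFilterStream, TextFilterEventArgs, and FilterDemo are hypothetical names, and the sketch wraps an arbitrary Stream (a MemoryStream in the demo) instead of an HttpResponse so that it runs as a console program.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical event args; the article's FilterEventArgs may look different.
public class TextFilterEventArgs : EventArgs
{
    public string Text { get; set; }
}

// Wraps a writable stream and passes each written chunk through a text filter.
public class TextFilterStream : Stream
{
    private readonly Stream _inner;
    private readonly Encoding _encoding;

    public event EventHandler<TextFilterEventArgs> Filtering;

    public TextFilterStream(Stream inner, Encoding encoding)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _encoding = encoding ?? throw new ArgumentNullException(nameof(encoding));
    }

    // Subclasses can override this; by default it raises Filtering and
    // returns whatever text the subscribers left in the event args.
    protected virtual string OnFilter(string text)
    {
        var args = new TextFilterEventArgs { Text = text };
        Filtering?.Invoke(this, args);
        return args.Text;
    }

    // Decode the chunk, filter it, then re-encode and pass it on.
    // Assumption: each chunk is decoded on its own, so text that straddles
    // two Write calls is not seen as a whole.
    public override void Write(byte[] buffer, int offset, int count)
    {
        string text = _encoding.GetString(buffer, offset, count);
        byte[] filtered = _encoding.GetBytes(OnFilter(text));
        _inner.Write(filtered, 0, filtered.Length);
    }

    // Every other member simply delegates to the wrapped stream.
    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => _inner.CanWrite;
    public override long Length => _inner.Length;
    public override long Position
    {
        get => _inner.Position;
        set => _inner.Position = value;
    }
    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => _inner.Read(buffer, offset, count);
    public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
    public override void SetLength(long value) => _inner.SetLength(value);
}

public static class FilterDemo
{
    // Writes the input through an upper-casing filter into a MemoryStream.
    public static string Run(string input)
    {
        var sink = new MemoryStream();
        var filter = new TextFilterStream(sink, Encoding.UTF8);
        filter.Filtering += (sender, e) => e.Text = e.Text.ToUpperInvariant();

        byte[] bytes = Encoding.UTF8.GetBytes(input);
        filter.Write(bytes, 0, bytes.Length);
        filter.Flush();
        return Encoding.UTF8.GetString(sink.ToArray());
    }

    public static void Main() => Console.WriteLine(Run("great expectations"));
}
```

In a page, a stream like this would presumably be assigned to Response.Filter with the response's ContentEncoding. Because each Write call is filtered in isolation, a match that spans two chunks would go unnoticed in this sketch.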

For example, to implement needle highlighting, HighlightFilter inherits Filter, overriding OnFilter and adding some properties and the Highlighting event.

The new OnFilter method uses Regex.Replace to replace instances of the needle in the haystack. It does this using the overload that takes a MatchEvaluator, a delegate that is called for each match that is found. This is perfect for this use because, if MatchWholeWords is true, the characters that bound the needle are replaced in kind, and the case of each match is not altered (by contrast, replacing with a fixed string, as String.Replace does, would overwrite the casing of every match with that of the needle).

If UseRegex is false, the needle is simply escaped with Regex.Escape instead of using an alternate means of searching and replacing.
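The replacement step might look something like the sketch below. It is not the article's HighlightFilter code: HighlightSketch and Highlight are hypothetical names, the parameters only mirror the properties mentioned in the text, and whole-word matching is assumed to use \b anchors, which may differ from how the download handles the bounding characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HighlightSketch
{
    // Hypothetical helper: wraps every match of the needle in openTag/closeTag
    // while leaving the matched text exactly as it appeared in the haystack.
    public static string Highlight(string haystack, string needle,
        bool matchCase = false, bool matchWholeWords = false, bool useRegex = false,
        string openTag = "<mark>", string closeTag = "</mark>")
    {
        if (string.IsNullOrEmpty(needle)) return haystack;

        // A literal needle is escaped so it can share the Regex code path.
        string pattern = useRegex ? needle : Regex.Escape(needle);

        // Assumption: whole-word matching is expressed with \b anchors.
        if (matchWholeWords) pattern = @"\b(?:" + pattern + @")\b";

        RegexOptions options = matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;

        // The evaluator runs once per match; m.Value keeps the original casing.
        MatchEvaluator wrap = m => openTag + m.Value + closeTag;
        return Regex.Replace(haystack, pattern, wrap, options);
    }

    public static void Main()
    {
        Console.WriteLine(Highlight("Pip met pip's sister.", "pip"));
        Console.WriteLine(Highlight("Pip met the pipers.", "pip", matchWholeWords: true));
    }
}
```

Because the evaluator reuses m.Value, "Pip" and "pip" are each wrapped with their own casing intact, which a fixed replacement string could not do.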

I was initially concerned that using Regex for replacement with a MatchEvaluator would be prohibitively slow, but replacement of common words in Great Expectations (just over one megabyte) takes a few milliseconds on my Core i7-2600K and hopefully not too much more on a typical web server. Interestingly, enabling "Match Whole Word" increases this to several seconds.

Points of Interest

In my first attempt, I derived a new class from MemoryStream and assigned it to the Filter property. I overrode the Write method and manipulated the buffer by wrapping instances of the keyword in a new element to which a CSS style could be assigned.

Inspection of the contents of the stream demonstrated that it worked quite nicely, and the class called base.Write to complete the task, but this resulted in zero bytes sent to the client. The sample application suggests maybe one needs to write out the bytes individually. Instead, I used my class to wrap the output stream.

Acknowledgements

Thank you to Project Gutenberg for the free distribution of Great Expectations and over 36,000 other works; and of course to Charles Dickens (1812-1870) himself.

History

  • October 31, 2011: Version 1.0.0.x
  • January 3, 2013: Modified title to better describe the nature of the topic

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
