A CAPTCHA Server Control for ASP.NET

31 Jan 2007
A CAPTCHA control implemented as a simple, visual drag-and-drop Server Control for ASP.NET.

Introduction

I'm sure everyone reading this is familiar with spam. There are two schools of thought when it comes to fighting spam:

  1. Bayesian filtering, such as POPFile.
  2. Challenge/response human verification, such as SpamArrest.

Both of these approaches have their pros and cons, of course. This article will only deal with the second technique: verifying that the data you are receiving is coming from an actual human being and not a robot or script. A CAPTCHA is a way of testing input to ensure that you're dealing with a human. Now, there are a lot of ways to build a CAPTCHA, as documented in this MSDN article on the subject, but I will be focusing on a visual data entry CAPTCHA.

There's already a great ASP.NET article on a CAPTCHA control here on CodeProject, so you may be wondering what this article is for. I wanted to rebuild that solution for the following reasons:

  • more control settings and flexibility
  • conversion to my preferred VB.NET language
  • abstraction into a full-blown ASP.NET server control

So, this article will document how to turn a set of existing ASP.NET web pages into a simple, drag-and-drop ASP.NET server control -- with a number of significant enhancements along the way.

Implementation

The first thing I had to deal with was the image generated by the CAPTCHA class. This was originally done with a dedicated .aspx form -- something that won't exist for a server control. How could I generate an image on the fly? After some research, I was introduced to the world of HttpModules and HttpHandlers. They are extremely powerful -- and a single HttpHandler solves this problem neatly.

All we need is a small Web.config modification in the <system.web> section:

<httpHandlers>
    <add verb="GET" path="CaptchaImage.aspx" 
       type="WebControlCaptcha.CaptchaImageHandler, WebControlCaptcha" />
</httpHandlers>

This handler defines a special page named CaptchaImage.aspx. Now, this "page" doesn't actually exist. When a request for CaptchaImage.aspx occurs, it will be intercepted and handled by a class that implements the IHttpHandler interface: CaptchaImageHandler. Here's the relevant code section:

Public Sub ProcessRequest(ByVal context As System.Web.HttpContext) _
       Implements System.Web.IHttpHandler.ProcessRequest
    Dim app As HttpApplication = context.ApplicationInstance

    '-- get the unique GUID of the captcha;
    '   this must be passed in via querystring
    Dim strGuid As String = Convert.ToString(app.Request.QueryString("guid"))

    Dim ci As CaptchaImage
    If strGuid = "" Then
        '-- mostly for display purposes when in design mode
        '-- builds a CAPTCHA image with all default settings
        '-- (this won't reflect any design time changes)
        ci = New CaptchaImage
    Else
        '-- get the CAPTCHA from the ASP.NET cache by GUID
        ci = CType(app.Context.Cache(strGuid), CaptchaImage)
        app.Context.Cache.Remove(strGuid)
    End If

    '-- the cache entry may have expired or already been served;
    '-- answer 404 rather than fail on a null reference
    If ci Is Nothing Then
        app.Response.StatusCode = 404
        app.Response.End()
        Return
    End If

    '-- let the browser know we are sending an image,
    '-- and that things are 200 A-OK
    app.Response.ContentType = "image/jpeg"
    app.Response.StatusCode = 200

    '-- write the image to the HTTP output stream as JPEG bytes
    ci.Image.Save(app.Context.Response.OutputStream, _
                              Drawing.Imaging.ImageFormat.Jpeg)
    app.Response.End()

End Sub

A new CAPTCHA image will be generated, and the image streamed directly to the browser from memory. Problem solved!
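
For reference, IHttpHandler has two members, and the excerpt above shows only ProcessRequest. The following is a minimal sketch of a complete handler class, not the article's CaptchaImageHandler: the class name MinimalImageHandler and the plain-text response are placeholders chosen purely for illustration.

```vbnet
Imports System.Web

' Hypothetical skeleton: shows the two members IHttpHandler requires.
Public Class MinimalImageHandler
    Implements IHttpHandler

    ' True tells ASP.NET it may reuse this instance for later requests.
    Public ReadOnly Property IsReusable() As Boolean _
           Implements IHttpHandler.IsReusable
        Get
            Return True
        End Get
    End Property

    ' Called once per request that maps to this handler in Web.config.
    Public Sub ProcessRequest(ByVal context As HttpContext) _
           Implements IHttpHandler.ProcessRequest
        context.Response.ContentType = "text/plain"
        context.Response.Write("handler reached")
    End Sub
End Class
```

Returning True from IsReusable is only appropriate when the handler keeps no per-request state in instance fields, as is the case in this sketch.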

However, there's another problem. There has to be communication between the HttpHandler responsible for displaying the image, and the web page hosting the control -- otherwise, how would the calling control know what the randomly generated CAPTCHA text was? If you view source on the rendered control, you'll see that a GUID is passed in through the querystring:

<img src="CaptchaImage.aspx?guid=99fecb18-ba00-4b60-9783-37225179a704" 
     border='0'>

This GUID (globally unique identifier) is a key used to access a CAPTCHA object that was originally stored in the ASP.NET Cache by the control. Take a look at the CaptchaControl.GenerateNewCaptcha method:

Private Sub GenerateNewCaptcha()
    LocalGuid = Guid.NewGuid.ToString
    If Not IsDesignMode Then
        '-- Session.Timeout is expressed in minutes, so use AddMinutes
        HttpContext.Current.Cache.Add(LocalGuid, _captcha, Nothing, _
            DateTime.Now.AddMinutes(HttpContext.Current.Session.Timeout), _
            TimeSpan.Zero, Caching.CacheItemPriority.NotRemovable, Nothing)
    End If
    Me.CaptchaText = _captcha.Text
    Me.GeneratedAt = Now
End Sub

It may seem a little strange, but it works great! The sequence of ASP.NET events is as follows:

  1. Page is rendered.
  2. Page calls CaptchaControl1.OnPreRender. This generates a new GUID and a new CAPTCHA object reflecting the control properties. The resulting CAPTCHA object is stored in the Cache by GUID.
  3. Page calls CaptchaControl1.Render; the special <img> tag URL is written to the browser.
  4. Browser attempts to retrieve the special <img> tag URL.
  5. CaptchaImageHandler.ProcessRequest fires. It retrieves the GUID from the querystring, the CAPTCHA object from the Cache, and renders the CAPTCHA image. It then removes the Cache object.

Note that there is a little cleanup involved at the end. If, for some reason, the control renders but the image URL is never retrieved, there would be an orphan CAPTCHA object in the Cache. This can happen, but should be rare in practice -- and our Cache entry only has a 20-minute lifetime anyway.

One mistake I made early on was storing the actual CAPTCHA text in the ViewState. The ViewState is not encrypted and can be easily decoded! I've switched to ControlState for the GUID, which is essential for retrieving the shared CAPTCHA object from the Cache -- but by itself, it is useless.
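
As a rough illustration of the ControlState approach, a control can persist a single string across postbacks by registering itself with the page and overriding SaveControlState and LoadControlState. This is a sketch under assumptions, not the control's actual source: the class name GuidStateControl and the field _localGuid are hypothetical.

```vbnet
Imports System
Imports System.Web.UI
Imports System.Web.UI.WebControls

' Hypothetical control: keeps only a GUID string in ControlState.
Public Class GuidStateControl
    Inherits WebControl

    Private _localGuid As String = String.Empty

    Protected Overrides Sub OnInit(ByVal e As EventArgs)
        MyBase.OnInit(e)
        ' ControlState is opt-in; the control must ask the page for it.
        Page.RegisterRequiresControlState(Me)
    End Sub

    Protected Overrides Function SaveControlState() As Object
        ' Pair the base class state with our own value.
        Return New Pair(MyBase.SaveControlState(), _localGuid)
    End Function

    Protected Overrides Sub LoadControlState(ByVal savedState As Object)
        Dim p As Pair = TryCast(savedState, Pair)
        If p IsNot Nothing Then
            MyBase.LoadControlState(p.First)
            _localGuid = CStr(p.Second)
        End If
    End Sub
End Class
```

Unlike ViewState, ControlState keeps working when a page developer turns ViewState off, which is why it suits a value the control cannot function without.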

CaptchaControl Properties

The CaptchaControl is a good ASP.NET citizen, and properly implements all the default ASP.NET Server Control properties. It also has a few properties of its own:

CAPTCHA control properties

| Property | Default | Description |
| --- | --- | --- |
| CacheStrategy | HttpRuntime | For security reasons, the CAPTCHA text is never sent to the client; it is only stored on the server. It can be stored in Session (web-farm friendly) or HttpRuntime (very fast, but local to one web server). |
| CaptchaBackgroundNoise | Low | Amount of background noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaChars | A-Z, 1-9 | A whitelist of characters to use when building CAPTCHA text. Each character is picked randomly from this string. By default, I omit some characters likely to be confused, such as O, 0, I, 1, 8, B, and so on. |
| CaptchaFont | "" | Font family to use for the CAPTCHA text. If not provided, a random installed font will be chosen for each character. A font whitelist is maintained internally so only known legible fonts will be used (e.g., not WingDings). |
| CaptchaFontWarping | Low | Level of warping used on each character of the CAPTCHA text. Ranges from None to Extreme. |
| CaptchaHeight | 50 | Default height of the CAPTCHA image, in pixels. |
| CaptchaLength | 5 | Number of characters used in the randomly generated CAPTCHA text. |
| CaptchaLineNoise | None | Amount of "scribble" line noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaMaxTimeout | 90 | Number of seconds that the CAPTCHA will remain valid and stored in the cache after it is generated. |
| CaptchaMinTimeout | 3 | Minimum number of seconds the user must wait before entering a CAPTCHA. |
| CaptchaWidth | 180 | Default width of the CAPTCHA image, in pixels. |
| UserValidated | False | After postback, returns True if the user entered text that matches the randomly generated CAPTCHA text. Note that the standard IValidator interface is implemented as well. |
| LayoutStyle | Horizontal | Determines if the text and input box are to the right of, or below, the image. Allows greater layout flexibility. |
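
To make the CaptchaChars and CaptchaLength descriptions concrete, here is a minimal sketch of drawing random text from a character whitelist. It is not the control's actual code: the module name, the function GenerateRandomText, and the SampleChars string are all hypothetical, and System.Random is used only for brevity (it is predictable, so a cryptographic generator may be a better fit in production).

```vbnet
Imports System
Imports System.Text

Module CaptchaTextSketch
    ' Hypothetical whitelist: omits look-alikes such as O, 0, I, 1, 8 and B.
    Public Const SampleChars As String = "ACDEFGHJKLMNPQRSTUVWXYZ2345679"

    ' Builds a string of the requested length, one random whitelist
    ' character at a time.
    Public Function GenerateRandomText(ByVal chars As String, _
                                       ByVal length As Integer, _
                                       ByVal rng As Random) As String
        If String.IsNullOrEmpty(chars) Then
            Throw New ArgumentException("chars must not be empty", "chars")
        End If
        If length < 1 Then
            Throw New ArgumentOutOfRangeException("length")
        End If

        Dim sb As New StringBuilder(length)
        For i As Integer = 1 To length
            ' rng.Next(n) returns a value from 0 to n - 1
            sb.Append(chars(rng.Next(chars.Length)))
        Next
        Return sb.ToString()
    End Function
End Module
```

A call such as GenerateRandomText(SampleChars, 5, New Random()) would mirror the default CaptchaLength of 5.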

Many of these properties have to do with the inherent tradeoff between human readability and machine readability. The harder a CAPTCHA is for OCR software to read, the harder it will be for us human beings, too! For illustration, compare these two CAPTCHA images:

The CAPTCHA on the left is generated with all "medium" settings, which are a reasonable tradeoff between human readability and OCR machine readability. The CAPTCHA on the right uses a lower CaptchaFontWarping, and a smaller CaptchaLength. If the risk of someone writing OCR scripts to defeat your CAPTCHA is low, I strongly urge you to use the easier-to-read CAPTCHA settings. Remember, just having a CAPTCHA at all raises the bar quite high.

The CaptchaTimeout property was added later to alleviate concerns about CAPTCHA farming. It is possible to "pay" humans to solve harvested CAPTCHAs by re-displaying a CAPTCHA and giving the user free MP3s or access to pornography if they solve it. However, this technique takes time, and it doesn't work if the CAPTCHA has a time-limited expiration.
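
The timing rules can be sketched as a small check on the time elapsed between generating the CAPTCHA and receiving the answer. This is an illustration under assumptions rather than the control's own logic: the module, the TimingResult enum, and the CheckTiming function are hypothetical names, and the two limits stand in for CaptchaMinTimeout and CaptchaMaxTimeout.

```vbnet
Imports System

Module CaptchaTimingSketch
    Public Enum TimingResult
        Ok
        TooFast
        TimedOut
    End Enum

    ' Classifies a submission by how long the user took to answer.
    Public Function CheckTiming(ByVal generatedAt As DateTime, _
                                ByVal submittedAt As DateTime, _
                                ByVal minTimeoutSeconds As Integer, _
                                ByVal maxTimeoutSeconds As Integer) As TimingResult
        Dim elapsed As Double = (submittedAt - generatedAt).TotalSeconds

        ' Answers faster than a human could type suggest a robot.
        If elapsed < minTimeoutSeconds Then Return TimingResult.TooFast

        ' Old answers may have been farmed out to a third party.
        If elapsed > maxTimeoutSeconds Then Return TimingResult.TimedOut

        Return TimingResult.Ok
    End Function
End Module
```

With the defaults from the properties table (3 and 90 seconds), an answer submitted after 1 second would be classed as TooFast and one submitted after 2 minutes as TimedOut.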

Conclusion

Many thanks to BrainJar for creating his simple yet effective CAPTCHA image class. Now that I've wrapped it up into an ASP.NET server control, it should be easier than ever to simply drop on a web form, set a few properties, and start defeating spammers at their own game!

There are many more details and comments in the demonstration solution provided at the top of the article, so check it out. And please don't hesitate to provide feedback, good or bad! I hope you enjoyed this article. If you did, you may also like my other articles.

History

  • Monday, November 8, 2004 - Published.
  • Friday, December 17, 2004 - Version 1.1
    1. added UserValidationEvent
    2. changed defaults to be less aggressive (more user friendly)
    3. added LayoutStyle property for choice of horizontal or vertical layout
    4. changed random font approach from blacklist to whitelist
    5. corrected intermittent order-of-retrieval bug reported by Robert Sindall
    6. converted to VB.NET 2005 compatible XML comments
  • Sunday, October 29, 2006 - Version 2.0
    1. major rewrite for .NET 2.0
    2. removed dependency on Session
    3. removed dependency on ViewState
    4. uses ControlState to store GUID
    5. implemented standard IValidator
    6. complete rewrite of the renderer for more secure CAPTCHAs
    7. added more tweakable properties
    8. switched to HttpRuntime caching
    9. changed cache priority to Caching.CacheItemPriority.NotRemovable (this was a bug, fixed in the old and new versions)
  • Monday, January 29, 2007 - Version 2.1
    1. Correct length bug
    2. Correct caching bug (units were set to minutes, not seconds!)
    3. Add option to store CAPTCHA text in Session for web farms
    4. Add minimum time to prevent aggressive robots
    5. Improved response messages to display exactly why the CAPTCHA was rejected (timed out, bad entry, too fast)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.