RegEx Tester - Regular Expression Tester

22 May 2011, licensed under GPLv3
A tool that helps you develop and fully test your regular expressions against a target text.
[Screenshot: RegEx Tester]

Introduction

RegEx Tester lets you develop and fully test a regular expression against a target text.
Its UI is designed to help with RegEx development, especially for big and complex expressions.
It supports almost all of the features available in the .NET Regex class.
A note on the writing: English is not my native language, and I did my best with the wording, grammar and spelling. You will probably find errors, so please tell me about them so that I can correct them.

Feature List

If you have an idea for a new feature, you can code it and email it to me, and I will add it here with proper credit for your work. If you don't know how to code the idea, leave a comment about it so that other people (or I) can implement it. Let's collaborate.

  • Asynchronous execution, which lets the user abort a running test, even after a catastrophic backtracking mess. [by Pablo Oses]
  • Indented Input mode, which strips \r \n \t \v and spaces before execution. This lets you write those ugly, long and cryptic RegExs in an indented and spaced fashion. [by Pablo Oses]
  • Replace mode, which enables the use of the Regex.Replace() function [by Pablo Oses]
  • Mono compatible, so it can run on Linux [by Pablo Oses]
  • Test Text highlighting based on results selection [by Davide Mauri]
  • F5 hot key to run the test without changing the cursor position or selection. Get off that mouse! [by Pablo Oses]
  • Listing of the matches showing position, length and anonymous or named capture groups [by Davide Mauri and Pablo Oses]
  • A "Clipboard Copy" button that copies the RegEx in "C# code snippet", "C# escaped string", "HTML encoded" or "Plain text" mode [by Kurt Griffiths and Pablo Oses]
  • Adjust the size of each of the 3 sections of the window (RegEx, Text and Results). [by Pablo Oses]
  • Ignore Case, MultiLine, SingleLine and Culture Invariant options [by Davide Mauri and Pablo Oses]
  • Window resizing and maximizing capability [by Kurt Griffiths and Pablo Oses]
  • Quick links to a RegEx Library and ILoveJackDaniels' CheatSheet [by Pablo Oses]
  • Find function inside the Test Text [by Pablo Oses]
  • Execution time measurements [by Pablo Oses]

What is Special About this Program?

As balazs_hideghety said in his comment, there are other popular and great programs, like RegexBuddy or Expresso, that are extremely powerful and seem to be the last word on RegEx development and testing, but I still use this tool. Why? I needed a tool that helped me design HTML extraction RegExs. For example (as of May 2009), if you evaluate this...

HTML
<td class="Frm_MsgSubject"><[^>]*?>(?<title>.*?)</a>.*?
	<td class="Frm_MsgAuthor"><[^>]*?>(?<author>.*?)</a>.*?
<td class="Frm_MsgDate"[^>]*?>(?<date>.*?)&.*?
	<td class="MsgBd BdSel ">.*?<td colspan="2">(?<body>[^<]*?)<

... against the HTML source of this page, you parse and extract, in a single operation, all the comments on this page with their corresponding title, body, date, user, etc. RegExs are really powerful when you are extracting data from real-world websites. The problem is that the RegExs needed are long and extremely cryptic. You really need some spacing and indentation to make them readable, plus a big window with a lot of space and a test textbox able to handle big raw HTML documents. That is where this tool turns out to be really useful.

The same RegEx while I was developing it in RegExTester looked like this:

HTML
<td\sclass="Frm_MsgSubject">    <[^>]*?    >    (?<title>.*?)    </a>
.*?
<td\sclass="Frm_MsgAuthor">        <[^>]*?    >    (?<author>.*?)    </a>
.*?
<td\sclass="Frm_MsgDate"    [^>]*?    >    (?<date>.*?)    &
.*?
<td\sclass="MsgBd\sBdSel\s">    .*?    <td\scolspan="2">    (?<body>[^<]*?)    <

As you can see, helping you develop ugly RegExs is, in my opinion, the key feature of this program.
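
The indented style above can be sketched as a small console program. This is only an illustration, not the tool's actual code: the helper name StripLayout, the sample pattern and the sample HTML are all made up for the example. It strips the characters named in the feature list (\r \n \t \v and spaces) and, for comparison, runs the same indented pattern through the built-in RegexOptions.IgnorePatternWhitespace option.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IndentedPatternDemo
{
    // Hypothetical helper: removes the layout characters (\r \n \t \v and spaces)
    // so that a pattern can be written in an indented, spaced fashion.
    public static string StripLayout(string indentedPattern)
    {
        return Regex.Replace(indentedPattern, @"[\r\n\t\v ]+", "");
    }

    public static void Main()
    {
        // Made-up pattern written in the indented style. Real spaces in the
        // target text must be written as \s, because plain spaces are stripped.
        string indented =
            "<td\\sclass=\"author\">    (?<author>.*?)    </td>\n" +
            ".*?\n" +
            "<td\\sclass=\"date\">      (?<date>.*?)      </td>";

        // Made-up target text.
        string html = "<td class=\"author\">Pablo</td><td class=\"date\">2009-05-03</td>";

        // Option 1: strip the layout characters by hand, then match.
        string compact = StripLayout(indented);
        Match stripped = Regex.Match(html, compact, RegexOptions.Singleline);

        // Option 2: let the engine ignore unescaped whitespace in the pattern.
        Match builtIn = Regex.Match(html, indented,
            RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

        Console.WriteLine(compact);
        Console.WriteLine(stripped.Groups["author"].Value + " / " + stripped.Groups["date"].Value);
        Console.WriteLine(builtIn.Groups["author"].Value + " / " + builtIn.Groups["date"].Value);
    }
}
```

One difference worth keeping in mind: with IgnorePatternWhitespace, a # starts a comment that runs to the end of the line and whitespace inside a character class is preserved, whereas the manual stripping in this sketch removes every space it finds.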

The Core of the Program: AsyncTest()

This is a simplified version of the function to make it more readable:

C#
// Create the options object based on the UI checkboxes
RegexOptions regexOptions = RegexOptions.None;
if (cbIgnoreCase.Checked) regexOptions |= RegexOptions.IgnoreCase;
if (cbCultureInvariant.Checked) regexOptions |= RegexOptions.CultureInvariant;
if (cbMultiLine.Checked) regexOptions |= RegexOptions.Multiline;
if (cbSingleLine.Checked) regexOptions |= RegexOptions.Singleline;
if (cbIndentedInput.Checked) regexOptions |= RegexOptions.IgnorePatternWhitespace;

// Creates the RegEx engine passing the RegEx string and the options object
Regex regex = new Regex(txtRegEx.Text, regexOptions);

// Execute the Regex and collect the results.
// Matching is lazy: nothing runs until a member of the MatchCollection is read,
// so reading the Count property forces the regex to execute from start to finish.
MatchCollection matchCollection = regex.Matches(rtbText.Text);
int matchesFound = matchCollection.Count;

// Also do the RegEx replacement if the user asked for it
if (cbReplaceMode.Checked)
    rtbResults.Text = regex.Replace(rtbText.Text, txtRepEx.Text);

// Add the Capture Group columns to the Results ListView
int[] groupNumbers = regex.GetGroupNumbers();

foreach (int groupNumber in groupNumbers)
{
    if (groupNumber > 0)
    {
        string groupName = "Group " + groupNumber;
        // Look the name up by number; indexing GetGroupNames() by group number
        // fails when explicitly numbered groups are not contiguous.
        string name = regex.GroupNameFromNumber(groupNumber);
        if (name != groupNumber.ToString())
            groupName += " (" + name + ")";
        lvResult.Columns.Add(groupName, 100, HorizontalAlignment.Left);
    }
}

// Process each of the Matches!
foreach (Match match in matchCollection)
{
    //Add it to the grid
    ListViewItem lvi = lvResult.Items.Add(match.ToString());
    lvi.SubItems.Add(match.Index.ToString());
    lvi.SubItems.Add(match.Length.ToString());
    for (int c = 1; c < match.Groups.Count; c++)
    {
        lvi.SubItems.Add(match.Groups[c].Value);
    }

    //Highlight the match in the RichTextBox
    rtbText.Select(match.Index, match.Length);
    rtbText.SelectionColor = Color.Red;
}
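
To try the same calls without the WinForms controls, here is a minimal console sketch. The sample text, the pattern and the helper CaptureValues are hypothetical and chosen only for illustration. It lists each match with its position and length, prints the named group, walks the multiple captures inside that group, and finishes with a Regex.Replace() call that uses a named-group substitution.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupListingDemo
{
    // Hypothetical sample data, not taken from the tool.
    public const string SampleText = "tags: red,green,blue; tags: one,two";
    public const string Pattern = @"tags:\s(?:(?<tag>\w+),?)+";

    // Returns every capture made by a group inside one match. A group that sits
    // inside a repeated construct can capture several times per match.
    public static string[] CaptureValues(Match match, string groupName)
    {
        CaptureCollection captures = match.Groups[groupName].Captures;
        string[] values = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
            values[i] = captures[i].Value;
        return values;
    }

    public static void Main()
    {
        Regex regex = new Regex(Pattern, RegexOptions.CultureInvariant);

        foreach (Match match in regex.Matches(SampleText))
        {
            Console.WriteLine("Match '{0}' at {1}, length {2}",
                match.Value, match.Index, match.Length);

            foreach (int groupNumber in regex.GetGroupNumbers())
            {
                if (groupNumber == 0) continue; // group 0 is the whole match

                string name = regex.GroupNameFromNumber(groupNumber);
                // Group.Value holds only the last capture of the group.
                Console.WriteLine("  Group {0} ({1}) = '{2}'",
                    groupNumber, name, match.Groups[groupNumber].Value);

                foreach (string value in CaptureValues(match, name))
                    Console.WriteLine("    capture '{0}'", value);
            }
        }

        // Replace mode: ${tag} inserts the last capture of the named group.
        Console.WriteLine(regex.Replace(SampleText, "last=${tag}"));
    }
}
```

The distinction between Group.Value and Group.Captures is what the "multiple captures inside a capture group" detection listed in the Program History is about: Value reports only the last capture, while Captures keeps all of them.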

The Asynchronous Execution Feature. Where the Fun Starts!

I first coded it using a BackgroundWorker, but I had to throw that out because it is only useful when you want to abort a long loop that is in your own code. It doesn't help when you call an external function that takes too long to complete.

So I re-coded it from scratch using lower-level Thread management, which turned out to be simpler and clearer than the previous technique.

C#
private Thread worker; // The worker that really does the execution in a separate thread.

private void MainForm_Load(object sender, System.EventArgs e)
{
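    // This is a critical line.
    // It disables the check that throws an InvalidOperationException when a control
    // is accessed from a thread other than the one that created it, so the worker
    // thread can update the controls of this form directly.
    Control.CheckForIllegalCrossThreadCalls = false;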
}

/// <summary>
/// Handle the multiple behaviors of the Test button based on its text
/// </summary>
private void btnTest_Click(object sender, System.EventArgs e)
{
    if (btnTest.Text == STOPPED_MODE_BUTTON_TEXT)
    {
        StartTest();
    }
    else if (btnTest.Text == RUNNING_MODE_BUTTON_TEXT)
    {
        AbortTest();
    }
}

/// <summary>
/// Prepare and launch the asynchronous execution using another thread
/// </summary>
private void StartTest()
{
    // Creates the separate Thread for executing the Test
    worker = new Thread(AsyncTest);

    // After this instruction if the worker hangs and this thread exits,
    // then nobody has to wait for the worker to finish. 
    // (e.g. The worker will be aborted if the user wants to close the app.)
    worker.IsBackground = true;

    // Start the Asynchronous Test function
    worker.Start();
}

/// <summary>
/// Instructs to abort the asynchronous execution of the Test.
/// </summary>
private void AbortTest()
{
    // This generates a ThreadAbortException at the worker function AsyncTest()
    if (worker.IsAlive) worker.Abort();
}

/// <summary>
/// This is the core of the app. The RegEx execution and processing function.
/// It runs on a separate thread.
/// </summary>
private void AsyncTest()
{
    // Every line in this function is susceptible to a ThreadAbortException
    // which is how the user is able to stop it.
    try
    {
        sbpStatus.Text = "Test running...";
        // ***************************************
        // Here is the code that you already read 
        // in the previous section of this article
        // [The core of the program: AsyncTest()]
        // ***************************************
        sbpStatus.Text = "Test success.";
    }
    catch (ThreadAbortException)
    {
        sbpStatus.Text = "Test aborted by the user.";
    }
    catch (Exception e)
    {
        sbpStatus.Text = "Test aborted by an error.";
        // Any other Exception is shown to the user
        MessageBox.Show(e.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
    finally
    {
        // Restore the btnTest functionality
        btnTest.Text = STOPPED_MODE_BUTTON_TEXT;
    }
}
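
For readers on newer runtimes, a different way to bound a runaway RegEx is a match timeout. This is not what the tool does; it is a sketch of an alternative, assuming .NET Framework 4.5 or later, where the Regex constructor accepts a TimeSpan and the engine throws a RegexMatchTimeoutException once the limit is exceeded. It may be of interest because Thread.Abort() is not supported on .NET Core and .NET 5+. The class name, the method name CountMatches and the sample pattern are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutDemo
{
    // Hypothetical helper: returns the number of matches, or -1 if the engine
    // gave up because the match timeout elapsed.
    public static int CountMatches(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            Regex regex = new Regex(pattern, RegexOptions.None, timeout);
            // Matches() is lazy, so the timeout can only fire once Count forces
            // the evaluation. That is why the read sits inside the try block.
            return regex.Matches(input).Count;
        }
        catch (RegexMatchTimeoutException)
        {
            return -1;
        }
    }

    public static void Main()
    {
        // A classic nested-quantifier pattern that can backtrack catastrophically
        // on input that almost matches.
        string pattern = @"^(a+)+$";
        string input = new string('a', 40) + "!";

        int result = CountMatches(pattern, input, TimeSpan.FromMilliseconds(200));
        Console.WriteLine(result == -1
            ? "Test aborted by timeout."
            : "Matches found: " + result);
    }
}
```

A timeout only limits how long a match may run; it does not let the user cancel on demand, so it is a complement to an Abort button rather than a replacement for it.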

Program's History

This tool was originally written by Davide Mauri (2003). I used it a lot at work and for personal projects. Because it was open source, I started adding the new features that I needed. Eventually the program was so different that I wanted to give all these enhancements back to Davide, so I contacted him by email. He gave me permission to re-release it and sent me the link to Kurt Griffiths's version of the program (2006). I merged his enhancements with mine and polished the UI.

Article History

  • 2011-05-21 - Article adapted to the new 3.2.0.0 version of the app
  • 2009-05-03 - Article adapted to the new 3.1.0.0 version of the app
  • 2008-03-08 - Moved histories to the bottom and updated links
  • 2008-03-05 - Article adapted to the new 3.0.0.0 version of the app with async execution
  • 2008-03-04 - Article completely rewritten to show and comment pieces of code used in the project
  • 2008-03-02 - Initial article

Program History

  • 2011-04-06 - 3.2.0.0 - by Pablo Osés
    • New features
      • Detection of multiple captures inside a capture group
    • Minor changes
      • Code janitoring
      • Credits update in About Window
  • 2010-02-25 - 3.1.1.0 - by Eric Lebetsamer
    • New features
      • Export to CSV
  • 2009-05-03 - 3.1.0.0 - by Pablo Osés
    • New features
      • RegEx.Replace()
      • Context menu icons
      • C# code snippet copy
      • RichTextBoxes WordWrap
      • Execution time
      • Mono compatible
    • Bug fixes
      • Label Typos
      • New SuspendLayout techniques and a lot of code re-engineering
  • 2008-03-05 - 3.0.0.0 - by Pablo Osés 
    • New features
      • Asynchronous execution
      • Copy Feature enhanced
      • Test Textbox context menu
      • Find functions
  • 2008-03-03 - 2.0.1.0 - by Pablo Osés
    • New features
      • RegEx Cheat-Sheet 
    • Bug fixes
      • Multiline behavior
      • Performance issues 
      • Results list click event
  • 2008-03-02 - 2.0.0.0 - by Pablo Osés
    • New features
      • Group names
      • Window maximize
      • Hot Keys
      • Indented Input
      • Culture Invariant
      • Resizeable Panels
  • 2006-xx-xx - 1.0.0.3 - by Kurt Griffiths 
    • New features
      • Copy and window resize
  • 2003-xx-xx - 1.0.0.3 - by Davide Mauri
    • Original version

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)