Find and Replace with Regular Expressions

6 Apr 2007 | CPOL | 4 min read
An article on using regular expressions to implement Find and Replace functionality
Screenshot - FindAndReplace.gif

Introduction

This article demonstrates how to use regular expressions and the Regex class (System.Text.RegularExpressions namespace) to build the find and replace functionality found in most text editors and word processors. Features such as whole-word search and case-sensitive or case-insensitive search are much easier to implement with regular expressions than with other methods. To demonstrate further, I included a wildcard search, in which * represents a group of characters and ? represents a single character. For example, if you enter a* and check the "Use Wildcards" checkbox, all words starting with a are selected. In addition, I have included a regular expression search, which helps if you are a regular expression freak.
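
As a minimal sketch of why these options are easy with regular expressions, the snippet below counts matches of a literal term with optional whole-word and case-sensitive matching. The names SearchOptionsSketch and CountMatches are illustrative assumptions and are not part of the article's project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SearchOptionsSketch
{
    // Counts occurrences of a literal term in text.
    // wholeWord wraps the pattern in \b word boundaries;
    // matchCase decides whether RegexOptions.IgnoreCase is used.
    public static int CountMatches(string text, string term,
                                   bool wholeWord, bool matchCase)
    {
        // Escape the term so characters such as '.' or '+' are literal
        string pattern = Regex.Escape(term);
        if (wholeWord)
        {
            pattern = @"\b" + pattern + @"\b";
        }

        RegexOptions options =
            matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;

        int count = 0;
        for (Match m = Regex.Match(text, pattern, options);
             m.Success; m = m.NextMatch())
        {
            count++;
        }
        return count;
    }
}
```

For the text "Cat catalog cat CAT" and the term "cat", a plain case-insensitive search counts four matches, while adding the whole-word option skips the match inside "catalog".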

A word of caution, though: this article is not a regular expression reference. The article and the accompanying code use regular expressions at a very basic level. If you want to learn about writing and understanding regular expressions, I would recommend An Introduction to Regular Expressions by Uwe Keim or The 30 Minute Regex Tutorial by Jim Hollenhorst.

Designing the Form

This project has only one form, FindAndReplaceForm, and no other forms or classes, so let's start by designing the form. The form contains a multi-line TextBox, contentTextBox, which holds the text to be searched. The text to search for is entered in another TextBox, searchTextBox, and the replacement text is entered in replaceTextBox. Besides these textboxes, the form contains two Labels, four CheckBoxes and three Buttons. The table below describes the controls on the form:

| Name | Type | Property to Set | Property Value | Description |
|---|---|---|---|---|
| contentTextBox | TextBox | Multiline | True | |
| Label1 | Label | Text | Search: | This label is not used in the code, so its default name is unchanged |
| Label2 | Label | Text | Replace: | This label is not used in the code, so its default name is unchanged |
| searchTextBox | TextBox | | | |
| replaceTextBox | TextBox | | | |
| matchWholeWordCheckBox | CheckBox | Text | Match whole word | |
| matchCaseCheckBox | CheckBox | Text | Match case | |
| useWildcardsCheckBox | CheckBox | Text | Use Wildcards | |
| useRegulatExpressionCheckBox | CheckBox | Text | Use Regular Expressions | |
| findButton | Button | Text | Find | |
| replaceButton | Button | Text | Replace | |
| replaceAllButton | Button | Text | Replace All | |

Once you complete the form design, it should look like the screenshot above.

Writing the Code

Now that you have an idea of the controls placed on the form, let's look at the code, starting with the class-level variables:

C#
// Requires: using System.Text.RegularExpressions;

// Keep the regex and the last match as class-level variables
// so that Find Next can continue from the previous match
private Regex regex;
private Match match;

// true when the next Find starts a new search,
// false when it should behave as Find Next
private bool isFirstFind = true;

Now let's examine the code for what is arguably the simplest feature to understand: Replace All. See the code below:

C#
// Click event handler of replaceAllButton
private void replaceAllButton_Click(object sender, EventArgs e)
{
	Regex replaceRegex = GetRegExpression();
	String replacedString;

	// get the current SelectionStart
	int selectedPos = contentTextBox.SelectionStart;

	// get the replaced string
	replacedString = replaceRegex.Replace
		(contentTextBox.Text, replaceTextBox.Text);

	// Is the text changed?
	if (contentTextBox.Text != replacedString)
	{
		// then replace it
		contentTextBox.Text = replacedString;
		MessageBox.Show("Replacements are made.   ", Application.ProductName,
		MessageBoxButtons.OK, MessageBoxIcon.Information);

		// restore the SelectionStart
		contentTextBox.SelectionStart = selectedPos;
	}
	else // inform user if no replacements are made
	{
		MessageBox.Show(String.Format("Cannot find '{0}'.   ", 
		searchTextBox.Text),
		Application.ProductName, MessageBoxButtons.OK,
		MessageBoxIcon.Information);
	}

	contentTextBox.Focus();
}

The GetRegExpression function returns an instance of the Regex class built from the text the user entered in the form and the checkboxes that are selected. Once we have this instance, we call its Replace method to make the replacements, and our job is done.
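
One detail of Regex.Replace is worth keeping in mind: its second argument is a replacement pattern, so sequences such as $0 or $& are treated as substitutions and not as literal text. The sketch below shows one possible way to insert the replacement text literally by doubling each $ character. It is an assumption about the desired behavior and not the article's code, and the names ReplaceAllSketch and ReplaceAllLiteral are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceAllSketch
{
    // Replaces every case-insensitive occurrence of the literal 'find'
    // with 'replacement', treating the replacement as plain text.
    public static string ReplaceAllLiteral(string text, string find,
                                           string replacement)
    {
        Regex regex = new Regex(Regex.Escape(find), RegexOptions.IgnoreCase);

        // In a replacement pattern, "$$" stands for a single literal "$",
        // so doubling each "$" disables substitutions such as "$0"
        string literalReplacement = replacement.Replace("$", "$$");

        return regex.Replace(text, literalReplacement);
    }
}
```

With this sketch, replacing "n" in "cost: N" with "$0" yields "cost: $0". Passing "$0" to Regex.Replace unchanged would insert the matched text itself.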

Now let's examine the GetRegExpression function. This function is called from most of the methods in this article:

C#
// This function makes and returns a RegEx object
// depending on user input
private Regex GetRegExpression()
{
	Regex result;
	String regExString;

	// Get what the user entered
	regExString = searchTextBox.Text;

	if (useRegulatExpressionCheckBox.Checked)
	{
		// If regular expressions checkbox is selected,
		// our job is easy. Just do nothing
	}
	// wild cards checkbox checked
	else if (useWildcardsCheckBox.Checked)
	{
		// escape regex metacharacters first so that only * and ?
		// act as wildcards (Escape turns them into \* and \?)
		regExString = Regex.Escape(regExString);

		// multiple characters wildcard (*)
		regExString = regExString.Replace(@"\*", @"\w*");

		// single character wildcard (?)
		regExString = regExString.Replace(@"\?", @"\w");

		// if wild cards selected, find whole words only
		regExString = String.Format("{0}{1}{0}",  @"\b", regExString);
	}
	else
	{
		// replace escape characters
		regExString = Regex.Escape(regExString);
	}

	// Is whole word check box checked?
	if (matchWholeWordCheckBox.Checked)
	{
		regExString = String.Format("{0}{1}{0}",  @"\b", regExString);
	}

	// Is match case checkbox checked or not?
	if (matchCaseCheckBox.Checked)
	{
		result = new Regex(regExString);
	}
	else
	{
		result = new Regex(regExString, RegexOptions.IgnoreCase);
	}

	return result;
}

As the code listing above shows, the GetRegExpression function does most of the important work: it turns the user's input and the checkbox settings into a single Regex instance.
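
The wildcard translation can also be tried outside the form. The following is a minimal sketch that assumes all other regex metacharacters in the input should be treated literally, so it escapes the input before translating * and ?. The names WildcardSketch and ToPattern are illustrative and do not appear in the article's project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardSketch
{
    // Converts a wildcard expression into a whole-word regex pattern:
    //   *  becomes \w*  (any run of word characters, possibly empty)
    //   ?  becomes \w   (exactly one word character)
    public static string ToPattern(string wildcard)
    {
        // Regex.Escape turns "*" into "\*" and "?" into "\?",
        // and makes characters such as "." literal
        string escaped = Regex.Escape(wildcard);

        string body = escaped
            .Replace(@"\*", @"\w*")
            .Replace(@"\?", @"\w");

        // Wildcard searches match whole words only
        return @"\b" + body + @"\b";
    }
}
```

For example, ToPattern("a*") produces \ba\w*\b, which matches "An" first when applied case-insensitively to "An apple a day".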

This is all that we need to do to implement the Replace All functionality. Now let's examine how the Find functionality is implemented.

C#
// Click event handler of find button
private void findButton_Click(object sender, EventArgs e)
{
	FindText();
}

// finds the text in searchTextBox in contentTextBox
private void FindText()
{
	// Is this the first time find is called?
	// Then make instances of RegEx and Match
	if (isFirstFind)
	{
		regex = GetRegExpression();
		match = regex.Match(contentTextBox.Text);
		isFirstFind = false;
	}
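	else
	{
		// match.NextMatch() would also work for a plain find, but
		// Replace changes the text, so search the current text again,
		// starting just after the start of the previous match
		//match = match.NextMatch();
		// clamp the start so it never exceeds the (possibly shorter) text
		int startAt = Math.Min(match.Index + 1, contentTextBox.Text.Length);
		match = regex.Match(contentTextBox.Text, startAt);
	}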

	// found a match?
	if (match.Success)
	{
		// then select it
		contentTextBox.SelectionStart = match.Index;
		contentTextBox.SelectionLength = match.Length;
	}
	else // didn't find? bad luck.
	{
		MessageBox.Show(String.Format("Cannot find '{0}'.   ", 
		searchTextBox.Text),
		Application.ProductName, MessageBoxButtons.OK,
		MessageBoxIcon.Information);
		isFirstFind = true;
	}
}

The click event handler of findButton simply calls the FindText method. FindText is also called from Replace, which is why I made it a separate function instead of writing the code in the event handler itself.
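
The find-next behavior can be sketched without any UI by calling Regex.Match(string, int) repeatedly, each time resuming one character past the start of the previous match. This is only an illustration under that assumption. FindNextSketch and FindAll are hypothetical names, and the length check guards against a start index beyond the end of the text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FindNextSketch
{
    // Returns the start index of every successive "find",
    // restarting the search at (previous match index + 1).
    public static List<int> FindAll(string text, Regex regex)
    {
        var positions = new List<int>();
        Match match = regex.Match(text);

        while (match.Success)
        {
            positions.Add(match.Index);

            int next = match.Index + 1;
            if (next > text.Length)
            {
                break; // nothing left to search
            }
            match = regex.Match(text, next);
        }
        return positions;
    }
}
```

Searching "one two one" for "one" yields the positions 0 and 8. Because the search restarts at index + 1 and not after the whole match, overlapping matches are also reported: searching "aaa" for "aa" yields 0 and 1, whereas NextMatch would report only 0.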

Now the only functionality that remains to explore is Replace. Let's complete that too:

C#
// Click event handler of replaceButton
private void replaceButton_Click(object sender, EventArgs e)
{
	// Make local Regex and Match instances
	Regex regexTemp = GetRegExpression();
	Match matchTemp = regexTemp.Match(contentTextBox.SelectedText);

	if (matchTemp.Success)
	{
		// check if it is an exact match
		if (matchTemp.Value == contentTextBox.SelectedText)
		{
			contentTextBox.SelectedText = replaceTextBox.Text;
		}
	}

	FindText();
}
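
The check at the heart of replaceButton_Click can be isolated as a small helper: the selection is replaced only when the search expression matches the entire selected text. The sketch below is illustrative, and the names ExactMatchSketch and IsExactMatch are assumptions that do not exist in the article's project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExactMatchSketch
{
    // Returns true only when the first match of the regex
    // covers the whole of selectedText.
    public static bool IsExactMatch(Regex regex, string selectedText)
    {
        Match m = regex.Match(selectedText);
        return m.Success && m.Value == selectedText;
    }
}
```

With a regex built from "cat", the selection "cat" qualifies for replacement, while "cats" and an empty selection do not.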

Before wrapping up the code, one small task remains: what to do with the isFirstFind variable? We declared it as a private field and check it in FindText to see whether the user is pressing the Find button for the first time. After the first find we set it to false, so that the next click is treated as Find Next, and we set it back to true when a search finds no match. Is this enough? Definitely not. We also need to know when the user has finished one search and a new one should start from the beginning. The approach I followed is to start a new search whenever searchTextBox or any of the checkboxes changes. This may not be the best approach, but I hope it satisfies most users. See the code listing below:

C#
// TextChanged event handler of searchTextBox
// Set isFirstFind to true, if text changes
private void searchTextBox_TextChanged(object sender, EventArgs e)
{
	isFirstFind = true;
}

// CheckedChanged event handler of matchWholeWordCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void matchWholeWordCheckBox_CheckedChanged(object sender, EventArgs e)
{
	isFirstFind = true;
}

// CheckedChanged event handler of matchCaseCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void matchCaseCheckBox_CheckedChanged(object sender, EventArgs e)
{
	isFirstFind = true;
}

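// CheckedChanged event handler of useWildcardsCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void useWildcardsCheckBox_CheckedChanged(object sender, EventArgs e)
{
	isFirstFind = true;
}

// CheckedChanged event handler of useRegulatExpressionCheckBox
// Follows the same pattern as the handlers above
private void useRegulatExpressionCheckBox_CheckedChanged(object sender, EventArgs e)
{
	isFirstFind = true;
}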

That's all about the code.

Conclusion

You can implement all of this functionality without regular expressions. However, using regular expressions results in much simpler and more maintainable code. This article explores only the features of the Regex class that are needed for Find and Replace, so a few important methods such as Matches and Split are not covered. And as I mentioned earlier, this article is not meant to be used as a reference to regular expressions.

History

  • 1st April, 2007 - First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)