Extension Me Strings: ToString Enhancements with Regular Expressions

29 Mar 2012 | CPOL | 3 min read
A walkthrough on creating a ToString extension method for easily extracting a substring matching a given regular expression (RegEx).

Welcome to the first entry on System.String for the Extension Me series. The Extension Me series focuses on various usable extension method implementations for the .NET Framework. This entry is a walkthrough on creating a ToString extension method for easily extracting a substring that matches a given regular expression (RegEx). General knowledge of regular expressions and of basic String manipulation in C# is assumed.

Implementation & Use

Diving right in, here is the implementation:

C#
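// Requires: using System; using System.Text.RegularExpressions;
// Extension methods must be declared inside a static, non-nested class.
public static String ToString(this String @this, String regexPattern)
{
    // Return the first matching substring, or null when nothing matches.
    return (Regex.IsMatch(@this, regexPattern) ?
             Regex.Match(@this, regexPattern).Value : null);
}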

This extension method is for System.String objects. It can be used like this:

// No ^/$ anchors: the number is embedded in surrounding text.
string phoneRegex = @"(\(\d{3}\)|\d{3})[- .]?\d{3}[- .]?\d{4}";
string phoneRaw = "ea011-122-2333klai";
string phone = phoneRaw.ToString(phoneRegex); // "011-122-2333"

Above is an example of the extension method in use. First, the regex pattern is stored; it matches particular phone number formats. The raw data for the phone number is stored next. The raw data cannot be used in its entirety to make a valid phone call, so it must be scrubbed. This happens in the call to phoneRaw.ToString(phoneRegex), which returns the string “011-122-2333”. With a single method call, the phone number goes from a raw, unusable form to a valid format.

Implementation Details

The ToString extension method created above uses the Regex class. This class provides various methods for retrieving information regarding the Regex’s anatomy such as groups, group numbers, and pattern matches. The ToString extension method uses two of these: IsMatch and Match.

Regex.IsMatch

This is a static method returning a Boolean value indicating whether a match is found in the specified string using the specified regular expression. In the extension method created above, this is used to decide whether to proceed with returning the value of Regex.Match, or simply return null.
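
As a small illustration of how Regex.IsMatch behaves, the sketch below contrasts an unanchored pattern with an anchored one. The class and method names (IsMatchSketch, ContainsPhone, IsExactlyPhone) and the simplified digit pattern are assumptions made for this example; they are not part of the article's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsMatchSketch
{
    // Hypothetical helper: true when a phone-like run appears anywhere in the input.
    public static bool ContainsPhone(string input)
    {
        return Regex.IsMatch(input, @"\d{3}-\d{3}-\d{4}");
    }

    // Hypothetical helper: true only when the pattern spans the input from
    // start to end (in .NET, $ also tolerates a single trailing newline).
    public static bool IsExactlyPhone(string input)
    {
        return Regex.IsMatch(input, @"^\d{3}-\d{3}-\d{4}$");
    }
}
```

Under these assumptions, ContainsPhone("ea011-122-2333klai") is true while IsExactlyPhone on the same input is false. Anchors suit validation of a whole value; a pattern meant to pull a value out of surrounding text is usually left unanchored.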

Regex.Match

This is a static method returning a Match object containing information about the first occurrence of the specified regular expression in the specified string. The extension method created above uses this to return the Value of the match. This value contains the string representation of the matched substring found in the input string.
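
Because the Match object returned by Regex.Match exposes a Success property, the same result can likely be obtained with one scan of the input instead of two. The following is a minimal sketch of that variant, not the article's implementation; the class name SinglePassSketch and the method name MatchOrNull are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassSketch
{
    // Hypothetical name; intended to follow the same contract as the
    // ToString(regexPattern) extension: first match's value, or null.
    public static string MatchOrNull(this string input, string regexPattern)
    {
        // Regex.Match never returns null; a failed search yields Match.Empty,
        // whose Success property is false.
        Match match = Regex.Match(input, regexPattern);
        return match.Success ? match.Value : null;
    }
}
```

A call such as "ea011-122-2333klai".MatchOrNull(@"\d{3}-\d{3}-\d{4}") would then return "011-122-2333", and an input with no match would yield null.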

Originally Perceived Enhancements

There are a few improvements that can be made to the original extension method. One in particular is the use of the static method Matches. This returns all of the successful matches of a regular expression in a given string. The user can then determine which match to use. Below is an example:

C#
public static String ToString(this String @this, String regexPattern, int matchIndex)
{
    var matches = Regex.Matches(@this, regexPattern);

    // Guard both ends of the range: a negative index would otherwise throw.
    return ( matchIndex >= 0 && matches.Count > matchIndex ?
             matches[matchIndex].Value : null );
}

Above is another version of the ToString extension method that accepts a RegEx pattern and a match index. The method retrieves all of the matches and, if valid, returns the desired match’s value. This way, the user is able to do this:

string phoneRegex = @"\d{3}";
string phoneRaw = "(011) 122-2333";
string areaCode = phoneRaw.ToString(phoneRegex, 0);

The snippet above uses the enhanced version of the ToString extension method to obtain the area code “011” of the phone number by retrieving the first occurrence of three consecutive digits.
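
To see which values each match index would select, it can help to list every match that Regex.Matches finds. The sketch below does that under stated assumptions: the class name AllMatchesSketch and the method name AllMatches are hypothetical, and the pattern \d{3} is used purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllMatchesSketch
{
    // Hypothetical helper: returns every matched substring in order of appearance.
    public static List<string> AllMatches(this string input, string regexPattern)
    {
        var values = new List<string>();

        // Matches are found left to right and do not overlap.
        foreach (Match match in Regex.Matches(input, regexPattern))
        {
            values.Add(match.Value);
        }

        return values;
    }
}
```

For the input "(011) 122-2333" and the pattern \d{3}, this yields "011", "122" and "233". Because matches do not overlap, index 2 selects "233" rather than the full four-digit line number, so a pattern that describes each part explicitly is the safer choice when more than the area code is needed.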

Conclusion

With just a few lines of code, the two versions of the ToString extension method created above support a range of string extraction and input validation tasks using regular expressions.

This has been another entry for the Extension Me series. Stay tuned for further entries and check out all of the articles that make up the Extension Me series: http://calebmcelrath.wordpress.com/category/net/extension-me/.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)