Welcome to the first entry on System.String for the Extension Me series. The Extension Me series focuses on various usable extension method implementations for the .NET Framework. This entry walks through creating a ToString extension method that extracts the substring matching a given regular expression (RegEx). General regular expression knowledge and basic String manipulation in C# are assumed.
Implementation & Use
Diving right in, here is the implementation:
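// Requires: using System.Text.RegularExpressions;
// Extension methods must be declared inside a static class.
public static String ToString(this String @this, String regexPattern)
{
    // Return the first match's text, or null when the pattern does not match.
    return Regex.IsMatch(@this, regexPattern)
        ? Regex.Match(@this, regexPattern).Value
        : null;
}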
This extension method is for System.String objects. It can be used like this:
string phoneRegex = @"(\(\d{3}\)|\d{3})[- .]?\d{3}[- .]?\d{4}";
string phoneRaw = "ea011-122-2333klai";
string phone = phoneRaw.ToString(phoneRegex);
Above is an example of the extension method in use. First, the regex pattern is stored; it matches particular phone number formats. The raw data for the phone number is then stored. The raw data cannot be used in its entirety to make a valid phone call, so it must be scrubbed. This occurs with the call to phoneRaw.ToString(phoneRegex), which returns the string “011-122-2333”. Using the extension method, the phone number went from a raw, unusable form to a valid format with a single method call.
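For readers who want to run the example end to end, here is a minimal sketch of a complete program. The class names StringRegexExtensions and Program are hypothetical, and the pattern is a simplified, unanchored one chosen for this sketch rather than the pattern from the article.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container class: extension methods must live in a static class.
public static class StringRegexExtensions
{
    // Returns the first substring matching regexPattern, or null if none exists.
    public static string ToString(this string @this, string regexPattern)
    {
        return Regex.IsMatch(@this, regexPattern)
            ? Regex.Match(@this, regexPattern).Value
            : null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Simplified pattern (an assumption for this sketch): ddd-ddd-dddd anywhere in the input.
        string pattern = @"\d{3}-\d{3}-\d{4}";

        Console.WriteLine("ea011-122-2333klai".ToString(pattern));          // 011-122-2333
        Console.WriteLine("no digits here".ToString(pattern) ?? "(null)");  // (null)
    }
}
```

Because String has no instance ToString overload that accepts a string argument, the compiler falls through to the extension method for these calls.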
Implementation Details
The ToString extension method created above uses the Regex class. This class provides various methods for retrieving information about a regular expression’s anatomy, such as groups, group numbers, and pattern matches. The ToString extension method uses two of these: IsMatch and Match.
Regex.IsMatch
This is a static method returning a Boolean value indicating whether a match is found in the specified string using the specified regular expression. In the extension method created above, it decides whether to return the value of Regex.Match or simply return null.
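As a small illustration of the behavior described above, the sketch below calls Regex.IsMatch on a few inputs. The class name IsMatchDemo and the sample strings are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsMatchDemo
{
    public static void Main()
    {
        string pattern = @"\d{3}-\d{4}";

        // The pattern may match anywhere in the input unless it is anchored.
        Console.WriteLine(Regex.IsMatch("call 122-2333 now", pattern));          // True
        Console.WriteLine(Regex.IsMatch("no number here", pattern));             // False

        // With ^ and $ the whole input must match, so surrounding text causes a failure.
        Console.WriteLine(Regex.IsMatch("call 122-2333 now", @"^\d{3}-\d{4}$")); // False
    }
}
```

The last call shows why anchors matter when the goal is to pull a substring out of noisy input.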
Regex.Match
This is a static method returning a Match object containing information about the first occurrence of the specified regular expression in the specified string. The extension method created above uses it to return the Value of the match, which contains the matched substring found in the input string.
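The sketch below inspects the Match object directly. It also shows one possible variation, not the article's implementation: a hypothetical helper named FirstMatchOrNull that checks Match.Success so the pattern is evaluated only once instead of calling IsMatch and then Match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Hypothetical helper: first match's text or null, using a single regex evaluation.
    public static string FirstMatchOrNull(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        Match m = Regex.Match("ea011-122-2333klai", @"\d{3}-\d{3}-\d{4}");
        Console.WriteLine(m.Success); // True
        Console.WriteLine(m.Value);   // 011-122-2333
        Console.WriteLine(m.Index);   // 2 (zero-based position where the match starts)

        // A failed match reports Success == false and an empty Value.
        Match none = Regex.Match("abc", @"\d+");
        Console.WriteLine(none.Success);     // False
        Console.WriteLine(none.Value == ""); // True

        Console.WriteLine(FirstMatchOrNull("abc", @"\d+") ?? "(null)"); // (null)
    }
}
```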
Originally Perceived Enhancements
There are a few improvements that can be made to the original extension method. One in particular is the use of the static method Matches, which returns all of the successful matches of a regular expression in a given string. The user can then determine which match to use. Below is an example:
public static String ToString(this String @this, String regexPattern, int matchIndex)
{
    var matches = Regex.Matches(@this, regexPattern);
    // Guard both bounds so a negative or too-large index yields null instead of throwing.
    return matchIndex >= 0 && matchIndex < matches.Count
        ? matches[matchIndex].Value
        : null;
}
Above is another version of the ToString extension method that accepts a RegEx pattern and a match index. The method retrieves all of the matches and, if the index is valid, returns the desired match’s value. This way, the user is able to do this:
string phoneRegex = @"\d{3}";
string phoneRaw = "(011) 122-2333";
string areaCode = phoneRaw.ToString(phoneRegex, 0);
Above shows the enhanced version of the ToString extension method in use. It obtains the area code “011” of the phone number by retrieving the first occurrence of three consecutive digits.
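To see how the match index selects among several matches, here is a self-contained sketch. The class names are hypothetical, and the lower-bound check on matchIndex is an added assumption so that a negative index returns null rather than throwing.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container class for the indexed overload.
public static class StringRegexIndexExtensions
{
    // Returns the match at matchIndex, or null when the index is out of range.
    public static string ToString(this string @this, string regexPattern, int matchIndex)
    {
        MatchCollection matches = Regex.Matches(@this, regexPattern);
        return matchIndex >= 0 && matchIndex < matches.Count
            ? matches[matchIndex].Value
            : null;
    }
}

public static class MatchesDemo
{
    public static void Main()
    {
        string phoneRaw = "(011) 122-2333";
        string pattern = @"\d{3}";

        Console.WriteLine(phoneRaw.ToString(pattern, 0));             // 011
        Console.WriteLine(phoneRaw.ToString(pattern, 1));             // 122
        Console.WriteLine(phoneRaw.ToString(pattern, 2));             // 233
        Console.WriteLine(phoneRaw.ToString(pattern, 3) ?? "(null)"); // (null)
    }
}
```

Note that index 2 yields "233", only part of the four-digit line number, because matches do not overlap and the pattern asks for exactly three digits. A stricter pattern would be needed to extract each part of the number reliably.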
Conclusion
The two versions of the ToString extension method created above make it possible, with just a few lines of code, to do all sorts of string manipulation and input validation using regular expressions.
This has been another entry for the Extension Me series. Stay tuned for further entries and check out all of the articles that make up the Extension Me series: http://calebmcelrath.wordpress.com/category/net/extension-me/.