Introduction
Many applications need to parse, normalize, and validate phone numbers. This article provides a single structure that makes it easy to parse, normalize, and validate NANP (North American Numbering Plan) phone numbers. It should support international phone numbers as well, but I haven't tested it with any. It also supports phonetic phone numbers (such as 1-800-MY-PHONE).
Background
I have run into many situations where I needed a simple way to parse, normalize, and validate phone numbers from user input on web forms, and so the PhoneNumber structure was born. It started out as a simple structure that parsed only a limited number of formats; over time I added more advanced parsing and then phonetic phone number support. It still has a few limitations when parsing phone numbers with extension codes, but it has handled almost every format I've tossed at it.
The structure was originally built to normalize and validate NANP numbers only, but it should be able to handle international phone numbers as well. About the only exception would be if a country code uses an alpha-numeric code instead of just a numeric code. I have not tested any international phone numbers with this structure though.
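To give a feel for what NANP validation involves, here is a minimal sketch of the most basic rule: the 3-digit NPA and NXX codes must each start with a digit from 2 to 9. The class and method names are hypothetical and this is not the check that the structure's IsNanpValid property performs; the real numbering plan also has further restrictions that are not modeled here.

```csharp
using System;

// Hypothetical helper for illustration only; not part of PhoneNumber.cs.
public static class NanpSketch
{
    // Checks only the basic NANP shape: NPA and NXX are three digits whose
    // first digit is 2-9, and the station code is at most four digits.
    public static bool IsBasicNanpValid(int npa, int nxx, int station)
    {
        bool npaOk = npa >= 200 && npa <= 999;
        bool nxxOk = nxx >= 200 && nxx <= 999;
        bool stationOk = station >= 0 && station <= 9999;
        return npaOk && nxxOk && stationOk;
    }
}
```

Under these assumptions, IsBasicNanpValid(800, 222, 2222) passes, while an area code such as 123 is rejected because it starts with 1.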
If you are interested in learning more about the NANP, good old Wikipedia is probably the best place to read about it.
Using the Code
The code file is fully documented and has quite a few examples included in it, so I will only provide some basic examples here to show off what the structure is capable of.
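// Build a number from its parts: country code, NPA, NXX, station, extension.
PhoneNumber fromParts = new PhoneNumber(1, 800, 222, 2222, 1234);
Console.WriteLine(fromParts);

// Parse a number with an extension from a string.
PhoneNumber fromString = new PhoneNumber("1-800-222-2222 ex 1234");
Console.WriteLine(fromString);

// Parse a phonetic number (letters in place of digits).
PhoneNumber phonetic = new PhoneNumber("800 MY-TEST ex 1234", true);
Console.WriteLine(phonetic);

// Check a parsed number against the NANP rules.
PhoneNumber phone = new PhoneNumber("(800) 222.2222");
Console.WriteLine("Is NANP valid? {0}", phone.IsNanpValid);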
The examples above use the default output format (the "P" format string), but the output can be customized to pretty much any format you want. Here is an overview of the format strings that are accepted:
Token | Description
D | Automatically formats the phone number based on its values. Possible formats: c-a-x-s E, c-a-x-s, (a) x-s E, (a) x-s, x-s E, and x-s.
G | Alias of "c-a-x-s e".
N | Plain numeric phone number with no formatting characters (equivalent to "caxs e", "caxs", "axs e", "axs", "xs e", or "xs").
P | Phonetic representation, if a phonetic string is specified. If there is no phonetic string, this format string acts the same as the "D" format string. Possible formats: c-a-p E, c-a-p, (a) p E, (a) p, p E, p, c-a-x-s E, c-a-x-s, (a) x-s E, (a) x-s, x-s E, and x-s.
F | Alias of "P (x-s)", or "D" if there is no phonetic string.
c | Replaced by the country code wherever it occurs in the format string.
a | Replaced by the 3-digit NPA code wherever it occurs in the format string.
x | Replaced by the 3-digit NXX code wherever it occurs in the format string.
s | Replaced by the 4-digit station code wherever it occurs in the format string.
e | Replaced by the extension code wherever it occurs in the format string.
E | Replaced by the extension code prefixed by "ext " wherever it occurs in the format string.
p | Replaced by the phonetic representation of the phone number wherever it occurs in the format string. If there is no phonetic string, this token is the same as "x-s". If the phonetic string is not seven characters, the NPA and NXX codes are used to make it seven characters.
All other characters are left as-is. Note that tokens are case-sensitive (for example, "e" and "E" produce different output).
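The single-character tokens can be pictured as a simple character-by-character substitution. The following is a rough sketch of that idea, not the structure's actual formatting code: the class name, method signature, and the choice to emit nothing for a missing extension are all assumptions, and the composite strings (D, G, N, P, F) and the phonetic token p are left out.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical stand-in for illustration; handles only c, a, x, s, e and E.
public static class PhoneFormatSketch
{
    public static string Format(string format, int country, int npa, int nxx,
                                int station, int? extension)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        StringBuilder sb = new StringBuilder();
        foreach (char token in format)
        {
            switch (token)
            {
                case 'c': sb.Append(country.ToString(inv)); break;
                case 'a': sb.Append(npa.ToString("000", inv)); break;      // 3-digit NPA
                case 'x': sb.Append(nxx.ToString("000", inv)); break;      // 3-digit NXX
                case 's': sb.Append(station.ToString("0000", inv)); break; // 4-digit station
                case 'e':
                    // Assumption: a missing extension produces no output.
                    if (extension.HasValue) sb.Append(extension.Value.ToString(inv));
                    break;
                case 'E':
                    if (extension.HasValue) sb.Append("ext ").Append(extension.Value.ToString(inv));
                    break;
                default:
                    sb.Append(token); // all other characters are left as-is
                    break;
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, Format("c-a-x-s E", 1, 800, 222, 2222, 1234) yields "1-800-222-2222 ext 1234", and Format("(a) x-s", 1, 800, 222, 2222, null) yields "(800) 222-2222".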
Points of Interest
I originally used a regular expression to parse the phone number from a string, but as I added more and more flexibility to the formats, the expression became a garbled, unreadable mess, so I switched to manual parsing, which is more readable and flexible but slightly slower. All of the parsing work is done in the private member _Parse().
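As an illustration of what manual parsing can look like, here is a small sketch that walks a string one character at a time, keeps digits, maps letters to digits using the standard telephone keypad, and skips common separators. It is not the _Parse() implementation: the names are hypothetical, the set of allowed separators is an assumption, and extension markers such as "ex" are not handled.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not the article's _Parse().
public static class PhoneParseSketch
{
    // Standard telephone keypad letter groups; index 0 corresponds to digit 2.
    private static readonly string[] KeypadGroups =
        { "ABC", "DEF", "GHI", "JKL", "MNO", "PQRS", "TUV", "WXYZ" };

    public static char ToKeypadDigit(char letter)
    {
        char upper = char.ToUpperInvariant(letter);
        for (int i = 0; i < KeypadGroups.Length; i++)
        {
            if (KeypadGroups[i].IndexOf(upper) >= 0)
                return (char)('2' + i);
        }
        throw new FormatException("Not a keypad letter: " + letter);
    }

    // Reduces the input to a plain digit string.
    public static string NormalizeDigits(string input)
    {
        StringBuilder digits = new StringBuilder();
        foreach (char ch in input)
        {
            if (ch >= '0' && ch <= '9')
                digits.Append(ch);
            else if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z'))
                digits.Append(ToKeypadDigit(ch));
            else if (" -.()".IndexOf(ch) < 0) // assumed separator set
                throw new FormatException("Unexpected character: " + ch);
        }
        return digits.ToString();
    }
}
```

For example, NormalizeDigits("1-800-MY-PHONE") returns "18006974663" and NormalizeDigits("(800) 222.2222") returns "8002222222". A loop like this is easy to extend with extra branches, which is the flexibility that became hard to maintain in a single regular expression.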
If you want to use this structure to scrub a database of phone numbers and normalize them, you will probably want to edit the _Parse function to relax its restrictions a little. Several times when scrubbing a database, I have had to comment out two checks to get every value through: the check that there are no more than seven phonetic characters (lines 1306 - 1309) and the check against an array of allowed characters (lines 1317 - 1320).
The structure implements the IComparable, IFormattable, and ISerializable interfaces and also provides operator overloads for ==, !=, <, >, <=, and >=. I have also provided a type converter for the structure, although I haven't had a chance to fully test it yet.
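For readers who have not written comparison operators for a value type before, the general pattern looks roughly like the sketch below. The DialedNumber type is hypothetical, and ordering by a single normalized digit value is an assumption made for illustration; the PhoneNumber structure may compare its parts differently.

```csharp
using System;

// Hypothetical type for illustration; not the article's PhoneNumber structure.
public struct DialedNumber : IComparable<DialedNumber>, IEquatable<DialedNumber>
{
    private readonly long _digits; // normalized digits, e.g. 18002222222

    public DialedNumber(long digits) { _digits = digits; }

    public int CompareTo(DialedNumber other) { return _digits.CompareTo(other._digits); }
    public bool Equals(DialedNumber other) { return _digits == other._digits; }

    public override bool Equals(object obj)
    {
        return obj is DialedNumber && Equals((DialedNumber)obj);
    }

    public override int GetHashCode() { return _digits.GetHashCode(); }

    // Every operator delegates to Equals or CompareTo so the results stay consistent.
    public static bool operator ==(DialedNumber l, DialedNumber r) { return l.Equals(r); }
    public static bool operator !=(DialedNumber l, DialedNumber r) { return !l.Equals(r); }
    public static bool operator <(DialedNumber l, DialedNumber r) { return l.CompareTo(r) < 0; }
    public static bool operator >(DialedNumber l, DialedNumber r) { return l.CompareTo(r) > 0; }
    public static bool operator <=(DialedNumber l, DialedNumber r) { return l.CompareTo(r) <= 0; }
    public static bool operator >=(DialedNumber l, DialedNumber r) { return l.CompareTo(r) >= 0; }
}
```

Routing all six operators through the same two methods means sorting, equality checks, and hash-based collections agree with each other.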
I didn't provide a project/solution file because I use MonoDevelop (yes, I am developing .NET in a Linux environment), and figured including the MonoDevelop project/solution files would be useless for most people. The PhoneNumberTest.cs file is just a simple console application that shows how to use the structure. All of the examples on how to use the code are in the documentation in the PhoneNumber.cs file.
TODO
- Implement a ParseExact() method (much like the DateTime.ParseExact() method)
- Implement a TryParseExact() method (like the DateTime.TryParseExact() method)
- Test and make sure that the structure supports international phone numbers, and possibly add international validation
- Improve the extension parsing to be less restrictive
History
- 24th December, 2008: Initial post