Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / regular-expression

Testing number inputs with special symbols

4.17/5 (3 votes)
12 Mar 2014CPOL 7.6K  
Tip for testing numeric inputs with special Unicode symbols.
Special digits verification against regular expressions.

Digits in Regular Expression

Developers in most cases use regular expressions to verify number inputs.

There are two ways to match a digit via regular expression:

- [0-9] matches an Arabic numeral, i.e. 0,1,2,3,4,5,6,7,8,9;

- \d matches a Unicode number.

In addition to Arabic numerals Unicode contains more than 300 numbers from different cultures:

0: 0,٠,۰,߀,०,০,੦,૦,୦,௦,౦,೦,൦,๐,໐,༠,၀,႐,០,᠐,᥆,᧐,᭐,᮰,᱀,᱐,꘠,꣐,꤀,꩐,0
1: 1,١,۱,߁,१,১,੧,૧,୧,௧,౧,೧,൧,๑,໑,༡,၁,႑,១,᠑,᥇,᧑,᭑,᮱,᱁,᱑,꘡,꣑,꤁,꩑,1
2: 2,٢,۲,߂,२,২,੨,૨,୨,௨,౨,೨,൨,๒,໒,༢,၂,႒,២,᠒,᥈,᧒,᭒,᮲,᱂,᱒,꘢,꣒,꤂,꩒,2
3: 3,٣,۳,߃,३,৩,੩,૩,୩,௩,౩,೩,൩,๓,໓,༣,၃,႓,៣,᠓,᥉,᧓,᭓,᮳,᱃,᱓,꘣,꣓,꤃,꩓,3
4: 4,٤,۴,߄,४,৪,੪,૪,୪,௪,౪,೪,൪,๔,໔,༤,၄,႔,៤,᠔,᥊,᧔,᭔,᮴,᱄,᱔,꘤,꣔,꤄,꩔,4
5: 5,٥,۵,߅,५,৫,੫,૫,୫,௫,౫,೫,൫,๕,໕,༥,၅,႕,៥,᠕,᥋,᧕,᭕,᮵,᱅,᱕,꘥,꣕,꤅,꩕,5
6: 6,٦,۶,߆,६,৬,੬,૬,୬,௬,౬,೬,൬,๖,໖,༦,၆,႖,៦,᠖,᥌,᧖,᭖,᮶,᱆,᱖,꘦,꣖,꤆,꩖,6
7: 7,٧,۷,߇,७,৭,੭,૭,୭,௭,౭,೭,൭,๗,໗,༧,၇,႗,៧,᠗,᥍,᧗,᭗,᮷,᱇,᱗,꘧,꣗,꤇,꩗,7
8: 8,٨,۸,߈,८,৮,੮,૮,୮,௮,౮,೮,൮,๘,໘,༨,၈,႘,៨,᠘,᥎,᧘,᭘,᮸,᱈,᱘,꘨,꣘,꤈,꩘,8
9: 9,٩,۹,߉,९,৯,੯,૯,୯,௯,౯,೯,൯,๙,໙,༩,၉,႙,៩,᠙,᥏,᧙,᭙,᮹,᱉,᱙,꘩,꣙,꤉,꩙,9  

Using these Unicode numbers it is possible to test number inputs on correctness.
For example, in most cases it is expected that a phone number will contain only Arabic numbers. It is easy to check by providing special symbols, for example, "١٢٣" instead of "123", or some Indian Unicode numbers:  (0), (1), (2), etc.

Below is an example of a valid Microsoft check for an Azure account recovery phone:

Image 1

Code to generate special digit symbols

.NET considers[0-9] and \d as different expressions, below is the C# script to find all Unicode numbers:

C#
 var stringBuilder = new StringBuilder();
 
 var digitRegex = new Regex(@"\d");
 var charDigitGroups = Enumerable.Range(Char.MinValue, Char.MaxValue)
                                 .Select(Convert.ToChar)
                                 .Where(ch => digitRegex.IsMatch(ch.ToString()))
                                 .GroupBy(ch => Char.GetNumericValue(ch));
 
foreach (var charGroup in charDigitGroups)
{
      string joinedValues = String.Join(",", charGroup);
      string rowResult = String.Concat(charGroup.Key.ToString(), ": ", joinedValues);
      stringBuilder.AppendLine(rowResult); 
} 

Some languages like JavaScript do not support Unicode in regular expressions by default, so there \d is the same as [0-9]. Nevertheless it is useful to check applications on the Unicode digital input independent on the realization details.

Related Links

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)