Here we explore doing validation on fields using the Visual FA lexing engine.
Introduction
While lexers are typically used to break text apart into tokens, they can also be used for validation. This is particularly useful when a field can have multiple valid formats and you need to know which format was used.
Background
You'll probably want to read up on Visual FA before continuing.
Here we will be using the runtime engine to lex the input, although you can just as easily generate the runners instead of referencing the runtime library.
What we're going to do is explore field validation. We'll be validating a field that can be either an email address or a phone number. We'll be using two regular expressions to do this, and building a lexer from them. We will then use runners with the lexer to validate text.
Using the code
First let's take a look at our field validation routine:
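static int _Match(FARunner runner)
{
    // The first token must be a successful match...
    var match = runner.NextMatch();
    if (match.IsSuccess &&
        runner.NextMatch().SymbolId ==
        FAMatch.EndOfInput)
    {
        // ...and nothing but the end of input may follow it.
        return match.SymbolId;
    }
    // Anything else is not a valid field.
    return -1;
}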
This routine takes any kind of runner and validates a field with it. The first token must be a valid match, and the next token must be the end of input symbol; otherwise the routine returns -1. On success, it returns the symbol id that was matched so we can use it to discern which type of data is in the field.
Next is our test routine, where we run all of the passed-in strings:
static void _RunStrings(FAStringRunner runner, string[] strings)
{
    for (int i = 0; i < strings.Length; i++)
    {
        runner.Set(strings[i]);
        Console.Write("\"");
        Console.Write(strings[i]);
        Console.Write("\" is ");
        switch (_Match(runner))
        {
            case 0: Console.WriteLine("an email address"); break;
            case 1: Console.WriteLine("a phone number"); break;
            default:
                Console.WriteLine("not a valid input");
                break;
        }
    }
}
Here we take an FAStringRunner and some strings. We can't take a plain old FARunner because we need to be able to Set() the input, and that function isn't available on the base class.
For each string we Set() the runner input, write out the string in quotes, and then the type of input it was. Note that we're calling _Match() here to get the symbol of the validated match.
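If the raw symbol ids feel too much like magic numbers, they could be mapped to an enum. The following is only a sketch and is not part of the Validation project: FieldKind and FieldValidator.Classify() are hypothetical names, the enum values simply mirror the 0 and 1 cases in the switch above, and it assumes the library's types live in the VisualFA namespace. It uses only the calls already shown in this article.

```csharp
using VisualFA; // assumed namespace for FAStringRunner and FAMatch

// Hypothetical enum: values mirror the symbol ids used in the switch above.
enum FieldKind
{
    Invalid = -1,
    Email = 0,
    Phone = 1
}

static class FieldValidator
{
    // Hypothetical helper: same logic as _Match(), but it takes the text
    // directly and returns a named kind instead of a raw symbol id.
    public static FieldKind Classify(FAStringRunner runner, string text)
    {
        runner.Set(text);
        var match = runner.NextMatch();
        // Valid only when a single token covers the whole field.
        if (match.IsSuccess &&
            runner.NextMatch().SymbolId == FAMatch.EndOfInput)
        {
            return (FieldKind)match.SymbolId;
        }
        return FieldKind.Invalid;
    }
}
```

A caller would then write something like `if (FieldValidator.Classify(runner, input) == FieldKind.Email)` rather than comparing against 0.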
Finally, we kick it off with our entry point code:
var strings = new string[]
{
    "foo",
    "",
    "baz@bar.com",
    "(300) 555-1212"
};
var email = FA.Parse(@"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|""(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*"")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])", 0);
var phone = FA.Parse(@"(?:\+[0-9]{1,2}[ ])?\(?[0-9]{3}\)?[ \.\-][0-9]{3}[ \.\-][0-9]{4}", 1);
var lexer = FA.ToLexer(new FA[] { email, phone });
var runner = new FAStringStateRunner(lexer);
_RunStrings(runner, strings);
Here we create state machines for email addresses and phone numbers from regular expressions. As you can see, the regular expression for email addresses is very complicated. Thank you, Internet. Note that we're using different accept ids for each one - zero and one, respectively.
Once we have those, we create a lexer from them, and then a state runner from that. We then pass that to _RunStrings().
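Supporting another format should, in principle, only take one more expression with its own accept id. As a sketch under that assumption, the listing below adds a hypothetical five-digit postal code with accept id 2. The postal expression and the Demo class are illustrative and not part of the Validation project, the VisualFA namespace is assumed, and the email expression is swapped for a deliberately simplified stand-in to keep the listing short.

```csharp
using System;
using VisualFA; // assumed namespace for FA, FAMatch and FAStringStateRunner

static class Demo
{
    static void Main()
    {
        // Simplified stand-in for the article's email expression.
        var email = FA.Parse(@"[a-z0-9]+@[a-z0-9]+(?:\.[a-z0-9]+)+", 0);
        // Phone expression copied from the article.
        var phone = FA.Parse(@"(?:\+[0-9]{1,2}[ ])?\(?[0-9]{3}\)?[ \.\-][0-9]{3}[ \.\-][0-9]{4}", 1);
        // Hypothetical third format: a five-digit postal code.
        var postal = FA.Parse(@"[0-9]{5}", 2);

        var lexer = FA.ToLexer(new FA[] { email, phone, postal });
        var runner = new FAStringStateRunner(lexer);

        foreach (var text in new[] { "baz@bar.com", "(300) 555-1212", "90210", "9021" })
        {
            runner.Set(text);
            var match = runner.NextMatch();
            // Same rule as _Match(): one token, then end of input.
            var valid = match.IsSuccess &&
                runner.NextMatch().SymbolId == FAMatch.EndOfInput;
            Console.WriteLine("\"{0}\" -> {1}", text, valid ? match.SymbolId : -1);
        }
    }
}
```

The switch in _RunStrings() would need a matching case 2 to report the new format.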
Please forgive the atrocious wrapping on the email regex. Visual FA does not support multiline expressions so it was necessary.
You can find the Validation project in the solution at the GitHub link provided at the top of the article.
History
24th April, 2024 - Initial submission