Understanding Regular Expressions in .NET

4 Nov 2002
I've created a Regex evaluator that has proven to be extremely helpful. Please feel free to use it, and e-mail me if you want the source.

The evaluator can be found here: RegexEvaluate.aspx

The evaluator was a learning process in itself, but well worth it. You'll want to do a little research in the SDK on .NET regular expression syntax. I needed the evaluator to help create parsing expressions for SQL and HTML. Generating the correct expressions would have been almost impossible without this tool to test with. I don't know how many people get into using regular expressions, but they are incredibly useful in a myriad of situations.
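To give a rough idea of what such an evaluator does, here is a minimal sketch in plain JavaScript rather than JScript.NET. It is not the code behind RegexEvaluate.aspx; the function name `evaluate` and the shape of the result are assumptions made for illustration. It runs a pattern against some text and reports every match with its position and captured groups.

```javascript
// Minimal regex evaluator sketch: list every match with its position and groups.
// `evaluate` is a hypothetical helper, not the code behind RegexEvaluate.aspx.
function evaluate(pattern, input, flags = '') {
  // matchAll requires the 'g' flag; add it if the caller left it out.
  const re = new RegExp(pattern, flags.includes('g') ? flags : flags + 'g');
  return Array.from(input.matchAll(re), (m) => ({
    index: m.index,                 // offset of the match in the input
    text: m[0],                     // the full matched text
    groups: m.slice(1),             // numbered capture groups, in order
    named: { ...(m.groups || {}) }, // named groups, if the pattern has any
  }));
}

// Try a pattern against sample text and inspect what each group captured.
const report = evaluate('(?<key>\\w+)=(?<val>\\d+)', 'a=1, b=22');
console.log(JSON.stringify(report, null, 2));
```

Seeing the groups next to each match is what makes a tool like this useful: a pattern that matches the right text can still capture the wrong pieces.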

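Pay attention to the grouping syntax, the constructs that begin with (? such as (?:...) and (?<name>...). Whether a group is capturing, named, or non-capturing makes a big difference to what a match gives you back.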
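A small sketch of why the grouping construct matters, written in plain JavaScript; the constructs shown, (...), (?:...) and (?<name>...), are also part of the .NET syntax. The sample input and variable names are made up for illustration.

```javascript
// Three ways to group the same sub-pattern, and what each one captures.
const input = 'width=100';

// Capturing groups: each pair of parentheses becomes a numbered group.
const numbered = /(\w+)=(\d+)/.exec(input);

// Non-capturing group (?:...): groups the sub-pattern but captures nothing.
const nonCapturing = /(?:\w+)=(\d+)/.exec(input);

// Named groups (?<name>...): the same captures, addressed by name.
const named = /(?<key>\w+)=(?<val>\d+)/.exec(input);

console.log(numbered.slice(1));                   // [ 'width', '100' ]
console.log(nonCapturing.slice(1));               // [ '100' ]
console.log(named.groups.key, named.groups.val);  // width 100
```

All three patterns match the same text; only the captured groups differ, which is exactly what changes the code that consumes the match.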

The following is the actual code I use to parse HTML. I'm a JScript.NET fiend, so you'll have to bear with me. I wish I knew a built-in .NET way to do this, but it hasn't made itself known to me.

	import System.Text.RegularExpressions;

	class RegularExpressions {

		static function TagOpen(tagname:String) :String
			{ return '<\\s*(?<tagname>'+tagname+')\\s*(?(?:\\s*\\b\\w+\\b\\s*(?:=\\s*(?:"[^"]*"|\'[^\']*\'|[^"\'<> ]+)\\s*)?)*)/?\\s*>' }
		static function TagClose(tagname:String) :String
			{ return '<\\s*/\\s*(?<tagname>'+tagname+')\\s*>' }
		static function NameValue(name:String) :String
			{ return '(?<name>'+name+')(\\s*=\\s*("(?<value>[^"]*)"|\'(?<value>[^\']*)\'|(?<value>[^"\'<> ]+)))?' }
		static function MLtags(tagname:String) :Regex
			{ return new Regex( TagOpen(tagname)+"|"+TagClose(tagname), RegexOptions.IgnoreCase ) }
		static function MLopentags(tagname:String) :Regex
			{ return new Regex( TagOpen(tagname) ) }
		static function NVpair(name:String) :Regex
			{ return new Regex( NameValue(name), RegexOptions.IgnoreCase )  }
		static const HTMLtags:Regex = MLtags('\\w+')
		static const IMGtags:Regex = MLopentags('IMG')
		static const NameValuePairs:Regex = NVpair('\\w+')
		static const Email:Regex = new Regex( '(?:\\w+[.]?)+@\\w+(?:[.]\\w+)+', RegexOptions.IgnoreCase)

	}
Sorry to leave this one as a puzzle, but you should be able to figure it out if you need it.
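As a starting point on the puzzle, here is a sketch of how pattern builders of this kind can be put to work: find every opening tag of a given name, then pull the name/value pairs out of its attribute list. It is written in plain JavaScript, not JScript.NET, and it is modelled on TagOpen and NameValue rather than copied from them. The helper names `tagOpen`, `nameValue` and `findTags`, the group name `attrs`, and the sample HTML are all assumptions. .NET lets the three alternatives in NameValue share the group name `value`; not every JavaScript engine accepts repeated names, so this sketch uses a single `value` group and strips the quotes afterwards.

```javascript
// An attribute value: double-quoted, single-quoted, or unquoted.
const VALUE = '"[^"]*"|\'[^\']*\'|[^"\'<>\\s]+';

// Opening tag with its attribute list captured in the (assumed) group "attrs".
function tagOpen(tagname) {
  return '<\\s*(?<tagname>' + tagname + ')' +
         '(?<attrs>(?:\\s+\\w+(?:\\s*=\\s*(?:' + VALUE + '))?)*)' +
         '\\s*/?\\s*>';
}

// One attribute: a name, optionally followed by = and a value.
function nameValue(name) {
  return '(?<name>' + name + ')(?:\\s*=\\s*(?<value>' + VALUE + '))?';
}

// Return [{ tag, attrs: { name: value } }] for every opening tag of that name.
function findTags(html, tagname) {
  const tagRe = new RegExp(tagOpen(tagname), 'gi');
  return Array.from(html.matchAll(tagRe), (t) => {
    const attrs = {};
    const pairRe = new RegExp(nameValue('\\w+'), 'gi');
    for (const p of t.groups.attrs.matchAll(pairRe)) {
      const raw = p.groups.value || '';  // a bare attribute has no value
      // Quoted values always carry both quotes, so drop the first and last char.
      attrs[p.groups.name.toLowerCase()] = /^["']/.test(raw) ? raw.slice(1, -1) : raw;
    }
    return { tag: t.groups.tagname, attrs };
  });
}

const html = '<p>Logo: <IMG src="logo.png" alt=\'Our logo\' width=32 ismap></p>';
console.log(findTags(html, 'img'));
```

Matching the tag first and the attributes second keeps each pattern small enough to test on its own. Like any regex-based approach, this sketch will be fooled by comments, scripts and malformed markup.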

The SQL expressions and methods I created are much more complex, and I would have a difficult time explaining them even to myself now. But I would love for someone to call me an idiot for making these and show me a better way. The HTML parsing was necessary to break HTML into controls so that certain controls could be replaced with their programmatic counterparts (like an <img> tag). The SQL expressions were created to help eliminate small differences between SQL statements, such as capitalization and spacing, and to break a statement down accurately, which helps in caching data and in determining what is already cached.
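The SQL expressions themselves are not shown here, so the following is only a guess at the general idea, sketched in plain JavaScript: normalize a statement so that two statements differing only in capitalization and spacing produce the same cache key. The helper name `normalizeSql` is hypothetical. The sketch assumes identifiers are case-insensitive, and it does not handle comments or double-quoted identifiers.

```javascript
// Normalize a SQL statement so that statements differing only in
// capitalization and spacing yield the same cache key. `normalizeSql` is a
// hypothetical helper; it does not reproduce the article's SQL expressions.
function normalizeSql(sql) {
  // Alternation order matters: a quoted literal is tried first and kept
  // verbatim, so whitespace and case inside it are never touched.
  const token = /'(?:[^']|'')*'|\s+|[^'\s]+/g;
  return sql
    .replace(token, (t) => {
      if (t[0] === "'") return t;     // string literal: leave as written
      if (/^\s/.test(t)) return ' ';  // any run of whitespace: one space
      return t.toUpperCase();         // everything else: upper-case (assumption)
    })
    .trim();
}

const first = normalizeSql("select  Name\nfrom Users   where City = 'New  York'");
const second = normalizeSql("SELECT name FROM users WHERE city = 'New  York'");
console.log(first);             // SELECT NAME FROM USERS WHERE CITY = 'New  York'
console.log(first === second);  // true
```

The normalized string can then serve as the key under which a query's results are cached.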

I hope this helps...
--Oren

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
