Latest update (V1.1)
Fixed a bug in the ConvertToLowerCase method, which did not work with tags that have attributes.
Introduction
This article presents a simple class that can be used to adjust the HTML code generated by ASP.NET in order to make it a valid XHTML document.
A valid XHTML document is a document that has been successfully tested using the W3C Markup Validation Service (see http://validator.w3.org). This free service checks XHTML documents for conformance to W3C recommendations. Validation is useful not only to guarantee that your site will be handled correctly by any W3C-compliant browser; this kind of compliance may also be a specific requirement from your customer.
The problem
The problem is that if you try to create an XHTML document using ASP.NET, you will probably fail, since the code generated by the ASP.NET engine is not valid XHTML.
Just create a simple ASPX page and then run the W3C validator. Here is a list of errors you will find:
Uppercase tags
XHTML is all lowercase and it is case sensitive. Tags like HTML or HEAD are undefined for the XHTML validator. You could fix this kind of problem by hand, editing the HTML directly in the Visual Studio editor. Unfortunately, each time you add a new control to the page and switch back and forth between the design view and the HTML view, the Visual Studio editor makes the HTML and HEAD tags all uppercase again.
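As a minimal sketch of how such a fix could be automated, the following C# snippet lowercases tag names with a regular expression. The class and method names (TagCaseSketch, LowerTagNames) are hypothetical and are not taken from the attached class. The sketch touches only tag names, not attribute names, and it does not try to skip script bodies or comments.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class TagCaseSketch
{
    // Matches the start of an opening or closing tag, e.g. "<HTML" or "</HEAD".
    private static readonly Regex TagName = new Regex(@"</?[A-Za-z][A-Za-z0-9]*");

    // Lowercases every tag name; attributes and text are left untouched.
    public static string LowerTagNames(string html)
    {
        return TagName.Replace(html, m => m.Value.ToLowerInvariant());
    }
}
```

For example, LowerTagNames("<HTML><HEAD></HEAD></HTML>") yields "<html><head></head></html>".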
Self-closing tags
In XHTML (as in XML), every tag must have a corresponding closing tag or be self-closing. Tags like <br> or <link href="style.css" rel="stylesheet"> are not valid XHTML. You should use <br /> and <link href="style.css" rel="stylesheet" /> instead.
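The rewrite described above can be sketched with a regular expression that closes a fixed set of empty elements. The names VoidTagSketch and CloseVoidTags, and the choice of elements in the pattern, are assumptions made for illustration. The sketch also assumes that attribute values never contain a ">" character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class VoidTagSketch
{
    // Group 1: element name; group 2: its attributes (if any).
    // An existing trailing "/" is consumed so the tag is not closed twice.
    private static readonly Regex VoidTag = new Regex(
        @"<(br|hr|img|input|link|meta)\b([^>]*?)\s*/?>",
        RegexOptions.IgnoreCase);

    // Rewrites e.g. <br> as <br /> and leaves other tags untouched.
    public static string CloseVoidTags(string html)
    {
        return VoidTag.Replace(html, "<$1$2 />");
    }
}
```

Tags that are already self-closed, such as <br/>, are normalized to the same <br /> form.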
Deprecated attributes
Some valid HTML attributes have been deprecated by XHTML. For instance, on elements such as form, the name attribute is replaced by the id attribute. If you take a look at the HTML code generated by ASP.NET, you will see the following form and script (the script is used to handle the ASP.NET postback mechanism).
<form name="Form1" method="post" action="Index.aspx" id="Form1">
<input type="hidden" name="__EVENTTARGET" value="" ID="Hidden1"/>
<input type="hidden" name="__EVENTARGUMENT" value="" ID="Hidden2"/>
<input type="hidden" name="__VIEWSTATE"
value="ReuDDhCfGkeYOyM6Eg==" ID="Hidden3"/>
<script language="javascript">
function __doPostBack(eventTarget, eventArgument) {
    var theform;
    if (window.navigator.appName.toLowerCase().indexOf("netscape") > -1)
    {
        theform = document.forms["Form1"];
    }
    else {
        theform = document.Form1;
    }
    theform.__EVENTTARGET.value = eventTarget.split("$").join(":");
    theform.__EVENTARGUMENT.value = eventArgument;
    theform.submit();
}
</script>
The name attribute of the form tag needs to be removed in order to make this code XHTML compliant.
Note that this code is generated only at run time, when the page is created. You have no way to change it at design time.
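Since the markup exists only at run time, the attribute has to be removed from the generated string. A minimal sketch, assuming double-quoted attribute values, could look like this; FormNameSketch and RemoveFormName are hypothetical names, not part of the attached class.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FormNameSketch
{
    // Group 1 captures "<form" plus any attributes that precede name="...".
    // [^>] keeps the match inside the form tag, so input tags keep their names.
    private static readonly Regex FormName = new Regex(
        @"(<form\b[^>]*?)\s+name\s*=\s*""[^""]*""",
        RegexOptions.IgnoreCase);

    // Drops the name attribute from form tags only.
    public static string RemoveFormName(string html)
    {
        return FormName.Replace(html, "$1");
    }
}
```

Only the form tag is changed; the hidden input tags keep their name attributes, which the postback mechanism relies on.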
Mandatory attributes
The above script has another problem. In the script tag, the type="text/javascript" attribute is missing. This attribute is mandatory according to the XHTML specification.
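One possible way to add the missing attribute is sketched below. ScriptTypeSketch and AddScriptType are hypothetical names. The sketch leaves script tags that already declare a type untouched, and it does not remove the language attribute.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class ScriptTypeSketch
{
    // Matches a <script ...> opening tag that has no type attribute.
    // The negative lookahead rejects tags that already contain type=.
    private static readonly Regex ScriptNoType = new Regex(
        @"<script\b(?![^>]*\btype\s*=)([^>]*)>",
        RegexOptions.IgnoreCase);

    // Inserts type="text/javascript" in front of the existing attributes.
    public static string AddScriptType(string html)
    {
        return ScriptNoType.Replace(html, "<script type=\"text/javascript\"$1>");
    }
}
```

Applied to the tag shown earlier, this produces <script type="text/javascript" language="javascript">.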
Misplaced tags
Still considering the content of Form1, the hidden input tags are not correctly placed. According to the XHTML specification, an input tag has to be inside one of the following tags: "p", "h1", "h2", "h3", "h4", "h5", "h6", "div", "pre", "address", "fieldset", "ins", "del".
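A simple way to satisfy this rule is to wrap each run of consecutive hidden input tags in a div. The sketch below assumes that approach; HiddenInputSketch and WrapHiddenInputs are hypothetical names, and the attached class may handle this differently. The sketch does not check whether the inputs already sit inside an allowed container.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class HiddenInputSketch
{
    // Matches one or more consecutive <input type="hidden" ...> tags,
    // including the whitespace between them.
    private static readonly Regex HiddenInputs = new Regex(
        @"(?:<input\b[^>]*\btype\s*=\s*""hidden""[^>]*>\s*)+",
        RegexOptions.IgnoreCase);

    // Wraps each run of hidden inputs in a single <div>...</div>.
    public static string WrapHiddenInputs(string html)
    {
        return HiddenInputs.Replace(
            html, m => "<div>" + m.Value.TrimEnd() + "</div>");
    }
}
```

Whitespace that follows the last hidden input of a run is dropped by TrimEnd, which does not affect the rendered page.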
The solution
A possible solution is to intercept the HTML code just before it is sent to the client web browser and make the needed corrections.
XHTMLPage class
The XHTMLPage class inherits from the System.Web.UI.Page class and overrides the Render method.
protected override void Render(HtmlTextWriter output)
{
    // Render the page into an in-memory buffer instead of the response.
    StringWriter w = new StringWriter();
    HtmlTextWriter myoutput = new HtmlTextWriter(w);
    base.Render(myoutput);
    myoutput.Close();

    // Keep the generated HTML so that it can be corrected.
    m_sXHTML = w.GetStringBuilder().ToString();

    ReplaceDocType();
    switch (m_XHTMLFormat)
    {
        case _XHTMLFormat.XHTML10_Strict:
            ConvertToXHTMLStrict();
            break;
        case _XHTMLFormat.XHTML10_Transitional:
            ConvertToXHTMLTransactional();
            break;
    }

    // Send the corrected markup to the client.
    output.Write(m_sXHTML);
}
In the XHTMLPage.Render method, the base class method base.Render is called first, using a new HtmlTextWriter object that has been created locally. This HtmlTextWriter is based on an underlying StringWriter object; in this way, the HTML code generated by ASP.NET is placed inside the m_sXHTML string, where it can be processed.
The ConvertToXHTML... methods take care of replacing the non-valid XHTML parts with equivalent XHTML code.
Make your page XHTML valid
In order to make any ASP.NET page an XHTML valid page, you just need to inherit from XHTMLPage instead of System.Web.UI.Page.
public class Index : XHTMLPage
The XHTMLPage class can be configured using the XHTMLFormat property; this can be set to Strict or Transitional (the default) in order to make the page valid according to the XHTML Strict or XHTML Transitional specification.
base.XHTMLFormat = XHTMLPage._XHTMLFormat.XHTML10_Strict;
Conclusion
Here I presented a problem that you may meet when trying to get a valid XHTML page using ASP.NET. This problem may be solved in the next version of Visual Studio, but in the meantime, I presented a simple solution you may find useful.
In the sample code I attached, I did not care too much about performance, but it is obvious that parsing the HTML generated by ASP.NET takes some time.
Credits
- sebmafate helped me in extending and fixing the class functionality.
History
04-Nov-2004
- Fixed a bug in the ConvertToLowerCase method, which did not work with tags that have attributes.
13-Oct-2004
- Automatically convert to lowercase all tags and attribute names.
- Added support for XHTML Frameset specification.
- Added support for encoding and language XML attributes.
- Added support for CDATA attributes.
27-Sep-2004