Introduction
Since the inception of XML, many developers have wondered why we need XML... How is it better than HTML and what does it do? For starters, XML is far more powerful than HTML, and the power resides in the "X" in XML (which stands for extensible). Rather than providing a set of pre-defined tags (as in the case of HTML), XML specifies the standards with which you can define your own markup languages with their own sets of tags. XML is therefore a meta-markup language, allowing you to define an infinite number of markup languages based upon the standards defined by XML.
XML was created so that richly structured documents could be used over the web. The only viable alternatives, HTML and SGML, are not practical for this purpose. XML allows you to define all sorts of tags with all sorts of rules, such as tags representing business rules or tags representing data description or data relationships.
In this article, we're going to take a look at some of the terminology that comes with using XML and its related technologies, as well as how to create XML documents and transform them with XSL using Microsoft's MSXML parser. To test the code samples shown in this article, you should be running Windows NT/2000/XP with IIS installed. You should also have SQL Server 2000 installed on the same machine.
XML definitions
As with any technology, XML has its own acronym-riddled lingo. Some of the important acronyms include:
- DTD: In XML, the definition of a valid markup is handled by a Document Type Definition (DTD), which communicates the structure of the markup language. The DTD specifies the validity of each tag.
- XSL: The Extensible Stylesheet Language (XSL) is the style language for XML that allows us to transform XML nodes using a set of patterns and templates.
- XML Pointer Language (XPointer) and XML Linking Language (XLink): These two technologies define a standard way to represent links between resources. In addition to simple links like HTML's <a> tag, XML has mechanisms for linking between multiple resources and linking between read-only resources. XPointer describes how to address a resource whereas XLink describes how to associate two or more resources.
- XML Flow Architecture: XML offers a three-tier architecture. It can be generated from existing databases that employ a 3-tier model themselves. We can maintain business rules separately.
Why should XML be used?
Using XML provides us as developers with a number of benefits. Some of the most obvious benefits include:
- Authors and providers can design their documents using XML, instead of being stuck with HTML. Documents can be explicitly tailored for an audience, so the cumbersome problems with HTML are theoretically eliminated, and both authors and designers are free to invent their own markup elements.
- Information can be richer and is easier to access and manipulate because the hypertext linking abilities of XML are much more advanced than those found in HTML.
- XML can provide more (and improved) facilities for browser presentation and performance.
- XML compresses exceedingly well. Data compression algorithms work by removing redundancy from the input stream, so it stands to reason that a highly ordered stream of regular, repeating tag sequences will compress much better than standard text, which generally contains far less order. The smaller payload means less data to transfer, which improves performance.
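The compression claim is easy to check for yourself. The following is a minimal sketch in Python rather than the ASP used in the rest of this article, so that it runs without IIS or MSXML. It assumes zlib as the compressor, a made-up document of 500 Employee records, and seeded random letters as a stand-in for low-redundancy input; ordinary prose would fall somewhere between the two.

```python
import random
import string
import zlib


def build_xml(count):
    """Build a repetitive XML document with `count` Employee records."""
    rows = []
    for i in range(count):
        rows.append(
            "<Employee><Name>Employee %d</Name>"
            "<Titles><TitleId>BU%04d</TitleId><royalty>%d</royalty></Titles>"
            "</Employee>" % (i, i, i % 100)
        )
    return "<Hi-Tech>" + "".join(rows) + "</Hi-Tech>"


def compression_ratio(text):
    """Return the original size divided by the zlib-compressed size."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw, 9))


xml_text = build_xml(500)

# Low-redundancy comparison input: seeded random letters of the same length.
rng = random.Random(0)
noise_text = "".join(rng.choice(string.ascii_letters) for _ in range(len(xml_text)))

xml_ratio = compression_ratio(xml_text)
noise_ratio = compression_ratio(noise_text)
print("XML ratio:   %.1f" % xml_ratio)
print("noise ratio: %.1f" % noise_ratio)
```

The exact figures depend on the compressor and the data, so treat the printed ratios as an illustration rather than a benchmark.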
Weaknesses of XML
XML is obviously not a cure-all language free of any disadvantages... otherwise we would be using XML to markup/represent all of our data, and nothing else! There are of course some drawbacks and weaknesses of XML, namely:
- XML markup can be incredibly verbose, depending on the vocabulary in question.
- All the pieces of the XML puzzle aren't yet in place, certainly not from a standards-compliant viewpoint. We've got both XSL and XSLT, but they are not fully developed yet.
- There are still some problems with Microsoft's XML parser.
- XML Hypertext Transfer Protocol (XML-HTTP) still has some minute problems.
Performance of XML
When you're designing an XML-based web application, what kind of performance hit should you expect on your web server? It's hard to generalize because there are so many variables to take into consideration, such as the size of the XML document, the amount of script code required to process the document, and the amount of output generated. The following list shows the major variables that can affect the performance of parsing XML:
- The kind of XML data being parsed.
- The ratio of tags to text.
- The ratio of attributes to elements.
- The amount of discarded white space in the document.
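To get a feel for these variables on a document of your own, you can count them. The sketch below is written in Python rather than ASP and uses the standard library's ElementTree parser, not MSXML. The function name document_profile and the id attribute in the sample are made up for illustration, and leading or trailing white space around text is used as a rough proxy for white space a parser would discard.

```python
import xml.etree.ElementTree as ET


def document_profile(xml_text):
    """Count elements, attributes, text characters and surrounding white space."""
    root = ET.fromstring(xml_text)
    profile = {"elements": 0, "attributes": 0, "text_chars": 0, "whitespace_chars": 0}
    for node in root.iter():
        profile["elements"] += 1
        profile["attributes"] += len(node.attrib)
        # node.text precedes the first child; node.tail follows the end tag.
        for chunk in (node.text, node.tail):
            if chunk:
                stripped = chunk.strip()
                profile["text_chars"] += len(stripped)
                profile["whitespace_chars"] += len(chunk) - len(stripped)
    return profile


sample = """<Hi-Tech>
  <Employee id="1">
    <Name>Bennet Abraham</Name>
  </Employee>
</Hi-Tech>"""

profile = document_profile(sample)
print(profile)
```

Dividing elements by text_chars, or attributes by elements, gives the ratios named in the list above.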
XML and DOM
Microsoft has provided us with the MSXML parser, which exposes an XML document in the form of a DOM (Document Object Model). With the XML DOM, you can load and parse XML files, gather information about those files, navigate through and manipulate those files. To learn more about the details of the XML DOM, please refer to this site.
Now that we've discussed the reasons for using XML, it's time to look at some source code. We will examine some ASP scripts that create and display XML data. We're going to create an XML file using both static data and data from a database using ADO. The DOM methods createNode and appendChild, as well as the text property, are used to construct an in-memory XML tree.
XML with ASP
The following example illustrates how to create an XML tree (in memory) and then persist it to disk using the save method:
<%
Dim xmldoc, root, onode, inode, child
Set xmldoc = Server.CreateObject("Microsoft.XMLDOM")
' Build the tree only if the document is still empty
If (xmldoc.childNodes.length = 0) Then
    ' Root element
    Set root = xmldoc.createNode("element", "Hi-Tech", "")
    xmldoc.appendChild root
    ' Employee element holding the employee's name as text
    Set onode = xmldoc.createNode("element", "Employee", "")
    onode.Text = "Gurpreet Singh"
    xmldoc.documentElement.appendChild onode
    ' Address element with two child lines
    Set inode = xmldoc.createNode("element", "Address", "")
    onode.appendChild inode
    Set child = xmldoc.createNode("element", "Address1", "")
    child.Text = "Nepean Ont"
    inode.appendChild child
    Set child = xmldoc.createNode("element", "Address2", "")
    child.Text = "Canada"
    inode.appendChild child
End If
' Persist the in-memory tree to disk
xmldoc.save Server.MapPath("savedI2.xml")
%>
In the example above, we create an XMLDOM object. We then create a root node and its child node using the createNode method. Next, we append the nodes after assigning the text property to each of them. Finally, we save the in-memory XML tree to a file, savedI2.xml.
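The same create, append and save sequence can be tried outside IIS. The sketch below is a Python analog, not MSXML: it uses the standard library's xml.dom.minidom, where createElement and createTextNode take the place of createNode and the text property. It writes to a temporary folder rather than the web root.

```python
import os
import tempfile
from xml.dom.minidom import Document

doc = Document()

# Root element
root = doc.createElement("Hi-Tech")
doc.appendChild(root)

# Employee element with its text content
employee = doc.createElement("Employee")
employee.appendChild(doc.createTextNode("Gurpreet Singh"))
root.appendChild(employee)

# Address element with two child lines
address = doc.createElement("Address")
employee.appendChild(address)
for tag, value in (("Address1", "Nepean Ont"), ("Address2", "Canada")):
    child = doc.createElement(tag)
    child.appendChild(doc.createTextNode(value))
    address.appendChild(child)

xml_text = doc.documentElement.toxml()

# Persist the tree, then read back the file size as a sanity check.
with tempfile.TemporaryDirectory() as folder:
    out_path = os.path.join(folder, "savedI2.xml")
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
    saved_size = os.path.getsize(out_path)

print(xml_text)
```

Note that the Employee element ends up with mixed content (text plus an Address child), exactly as in the ASP version.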
We can also build an XML file from the results of a database query. I've included two files with the support material for this article: pubtest.asp and saved.xsl. Pubtest.asp connects to the SQL Server 2000 pubs database, retrieves several records from the authors table, formats them as a new XML document and saves that document as saved.xml.
The saved.xsl file contains an XSL style sheet which is used by pubtest.asp to format saved.xml as HTML. You should download the support material before continuing.
Here's an extract from pubtest.asp:
Do While Not rs.EOF
    ' One Employee element per author
    Set onode = xmldoc.createNode("element", "Employee", "")
    xmldoc.documentElement.appendChild onode
    Set inode = xmldoc.createNode("element", "Name", "")
    inode.Text = rs.fields(0) & " " & rs.fields(1)
    onode.appendChild inode
    ' Look up the titles for this author
    Sql = "select title_id,royaltyper from titleauthor " & _
          "where au_id = '" & rs.fields(2) & "'"
    Set rs2 = conn.Execute(Sql)
    ' Only the first matching row is written to the document
    If Not (rs2.EOF And rs2.BOF) Then
        Set inode = xmldoc.createNode("element", "Titles", "")
        onode.appendChild inode
        Set child = xmldoc.createNode("element", "TitleId", "")
        child.Text = rs2.fields(0)
        inode.appendChild child
        Set child = xmldoc.createNode("element", "royalty", "")
        child.Text = rs2.fields(1)
        inode.appendChild child
    End If
    ' Close the inner recordset whether or not it returned rows
    rs2.Close
    Set rs2 = Nothing
    rs.MoveNext
Loop
The XML document that results from the ASP code shown above begins like this (only the first Employee element is shown):
<Hi-Tech>
  <Employee>
    <Name>Bennet Abraham</Name>
    <Titles>
      <TitleId>BU1032</TitleId>
      <royalty>60</royalty>
    </Titles>
  </Employee>
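The query-to-XML loop can be sketched without SQL Server. The following Python version is an analog, not pubtest.asp itself: an in-memory SQLite database stands in for pubs, the au_id values and the second author are invented sample rows, and the table and column names are borrowed from the queries above. It differs from the ASP extract in two ways: it binds au_id as a query parameter instead of concatenating it into the SQL string, and it writes one Titles element per matching row rather than only the first.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory stand-in for the pubs database (sample rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (au_id TEXT, au_lname TEXT, au_fname TEXT);
    CREATE TABLE titleauthor (au_id TEXT, title_id TEXT, royaltyper INTEGER);
    INSERT INTO authors VALUES ('A001', 'Bennet', 'Abraham');
    INSERT INTO authors VALUES ('A002', 'Doe', 'Jane');
    INSERT INTO titleauthor VALUES ('A001', 'BU1032', 60);
""")

root = ET.Element("Hi-Tech")
authors = conn.execute(
    "SELECT au_lname, au_fname, au_id FROM authors ORDER BY au_id").fetchall()

for au_lname, au_fname, au_id in authors:
    employee = ET.SubElement(root, "Employee")
    ET.SubElement(employee, "Name").text = au_lname + " " + au_fname
    # Parameter binding avoids building SQL by string concatenation.
    titles = conn.execute(
        "SELECT title_id, royaltyper FROM titleauthor WHERE au_id = ?",
        (au_id,)).fetchall()
    for title_id, royaltyper in titles:
        node = ET.SubElement(employee, "Titles")
        ET.SubElement(node, "TitleId").text = title_id
        ET.SubElement(node, "royalty").text = str(royaltyper)

conn.close()
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

An author with no rows in titleauthor simply gets an Employee element with a Name and no Titles child.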
Here's an extract from the XSL style sheet file, saved.xsl:
<xsl:for-each select="Hi-Tech/Employee">
  <TR>
    <TD WIDTH="40%" BGCOLOR="lightyellow">
      <FONT FACE="Verdana" SIZE="2" COLOR="black">
        <xsl:value-of select="Name" />
      </FONT>
    </TD>
    <TD WIDTH="30%" BGCOLOR="lightyellow">
      <FONT FACE="Verdana" SIZE="2" COLOR="black">
        <xsl:value-of select="Titles/TitleId" />
      </FONT>
    </TD>
    <TD WIDTH="30%" BGCOLOR="lightyellow">
      <FONT FACE="Verdana" SIZE="2" COLOR="black">
        <xsl:value-of select="Titles/royalty" />
      </FONT>
    </TD>
  </TR>
</xsl:for-each>
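If the template syntax is unfamiliar, it may help to see what the for-each amounts to in procedural terms. Python's standard library has no XSLT processor, so the sketch below does not run the style sheet; it mimics it with ElementTree path lookups, emitting one table row per Employee and one cell per value-of. The helper name cell is made up, and the FONT tags are left out for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape

source = ET.fromstring(
    "<Hi-Tech><Employee><Name>Bennet Abraham</Name>"
    "<Titles><TitleId>BU1032</TitleId><royalty>60</royalty></Titles>"
    "</Employee></Hi-Tech>")


def cell(width, value):
    """Render one table cell, escaping the text for HTML."""
    return '<TD WIDTH="%s" BGCOLOR="lightyellow">%s</TD>' % (width, escape(value))


rows = []
# The style sheet selects "Hi-Tech/Employee" from the document node. Here the
# parsed root already is Hi-Tech, so the relative path is just "Employee".
for employee in source.findall("Employee"):
    rows.append("<TR>%s%s%s</TR>" % (
        cell("40%", employee.findtext("Name", default="")),
        cell("30%", employee.findtext("Titles/TitleId", default="")),
        cell("30%", employee.findtext("Titles/royalty", default="")),
    ))

html_rows = "\n".join(rows)
print(html_rows)
```

Like value-of, findtext returns the text of the first matching node only, so an employee with several Titles elements would still produce a single row.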
When I ran pubtest.asp in my browser, here's what it looked like:
Let's now run through the entire process of retrieving data from the pubs database, saving it to an XML file, and loading and transforming this document with XSL using the MSXML parser.
Our XML example explained
Firstly, we connect to SQL Server 2000 using a system DSN. The DSN is called pubs, and you should create it using the Windows Control Panel. It should point to the pubs database on your SQL Server. We instantiate an ADO Connection object, passing the DSN to its Open method:
Set conn = Server.CreateObject("ADODB.Connection")
dsn = "DSN=pubs;UID=sa;PWD="
conn.Open dsn
Once we're connected to our database, we create a new XML document and assign it a root element called Hi-Tech. We then proceed to retrieve a recordset from the authors table of our pubs database:
If (xmldoc.childNodes.length = 0) Then
Set root = xmldoc.createNode("element", "Hi-Tech", "")
xmldoc.appendChild (root)
Sql = "select au_lname,au_fname,au_id from authors"
Set rs = conn.Execute(Sql)
rs.MoveFirst
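We then loop through each record in the recordset, appending the author's name to the XML document that we created earlier. For each author we also query the titleauthor table and append its title_id and royaltyper fields as TitleId and royalty elements. We use the createNode and appendChild methods to do so: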
Set inode = xmldoc.createNode("element", "Titles", "")
onode.appendChild (inode)
Set child = xmldoc.createNode("element", "TitleId", "")
child.Text = rs2.fields(0)
inode.appendChild (child)
Set child = xmldoc.createNode("element", "royalty", "")
child.Text = rs2.fields(1)
inode.appendChild (child)
Once we've retrieved each of the records from the recordset and appended them to our XML document, we save the XML document to disk on the web server using MSXML's save method:
xmldoc.save server.mappath("saved.xml")
We're now at the point where we have an XML file called saved.xml, as well as the style sheet that's included in the support material for this article, called saved.xsl. We instantiate a new XMLDOM object for each of these files and load them synchronously, so that we can call the transformNode method of the DOM object holding the XML, passing it the DOM object holding the XSL file:
sourceFile = Server.MapPath("saved.xml")
styleFile = Server.MapPath("saved.xsl")
set source = Server.CreateObject("Microsoft.XMLDOM")
source.async = false
source.load(sourceFile)
set style = Server.CreateObject("Microsoft.XMLDOM")
style.async = false
style.load(styleFile)
Lastly, we use Response.Write to output the transformed XML to the browser:
Response.Write source.transformNode(style)
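The listing above assumes that both load calls succeed. In MSXML, load returns False when a document cannot be parsed, and the details are available through the document's parseError property, so it is worth checking before calling transformNode. The guard itself is sketched below in Python rather than ASP, with ElementTree's ParseError standing in for parseError; the function name load_xml is made up for illustration.

```python
import xml.etree.ElementTree as ET


def load_xml(text):
    """Return (root, None) on success, or (None, message) if not well-formed."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        line, column = err.position
        return None, "line %d, column %d: %s" % (line, column, err)


good_root, good_error = load_xml("<Hi-Tech><Employee /></Hi-Tech>")
bad_root, bad_error = load_xml("<Hi-Tech><Employee></Hi-Tech>")

if bad_error:
    # In the ASP page this is where you would report the problem
    # instead of attempting the transformation.
    print("Could not load document:", bad_error)
```

Reporting the line and column of the failure makes a malformed saved.xml or saved.xsl much quicker to track down than a blank page.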
Conclusion
Well, I hope this article has answered some of your questions on XML. Hopefully you've learned a thing or two about the advantages and disadvantages of using XML, when it can be used, and most importantly, how it can be used.