Chapter 7 - Programming with XML

Erik Westermann

12 Oct 2002

Sample chapter from Learn XML In A Weekend by Erik Westermann

Title	Learn XML In A Weekend
Author	Erik Westermann
Publisher	Premier Press Books
Published	October 2002
ISBN	159200010X
Price	USD 29.99
Pages	400

Introduction

This article is an excerpt from Learn XML In a Weekend, specifically Sunday Evening's lesson (chapter seven), called Programming with XML. I selected this chapter as the book's sample chapter because I feel that it represents the flow of the book's overall discussion and demonstrates the focus of each lesson.

This book contains seven lessons and other resources and information that are focused on only one thing: getting you up to speed with XML, its related technologies, and latest developments. The lessons are organized into a timeframe that spans a weekend beginning on Friday evening and ending on Sunday evening - and, yes, you can learn XML in a weekend.

With all of the other XML books that line the shelves of bookstores and online stores, I know what you're asking yourself: "What's so special about this book?" This book is different because it not only explains what XML is and how to use it, but also presents relevant, practical, real-world uses of XML.

This book focuses on relevant XML technologies like XPath, XSD, DTD, and CSS, and explains why other technologies, like XDR, may not be important in certain scenarios. It also takes a practical approach to working with XML. After covering the core syntax and other rules, I show you how to work with XML using two of the best XML editors on the market today: Excelon's Stylus Studio and Altova's XML Spy. There's not much point in writing XML documents, schemas, and transformations by hand when XML editors can not only assist you but also generate a lot of XML for you.

I also discuss how to use XML in Internet Explorer and Microsoft Active Server Pages, and how to use XML with Microsoft's latest offering: the .NET Framework and the Visual C# .NET programming language.

This book is for you because it succinctly describes XML and its related technologies, focusing only on what's relevant in today's rapidly changing marketplace. I help you make choices that can mean the difference between a successful solution and one that fails because it uses irrelevant, incompatible, or outdated standards.

Here's an overview of each lesson:

Friday Evening: A short lesson that introduces XML in terms of what it is, why it is useful, and how others use XML.
Saturday Morning: Focuses on using XML in Internet Explorer using HTML and XSL, and using XML with Microsoft's Active Server Pages. The intent of this lesson is to give you an overview of what you can do with XML - don't worry if you're not a programmer or don't understand the programming language that's used in the lesson. The idea is to expose you to these technologies so that you'll gain a better understanding of how others use XML.
Saturday Afternoon: This lesson focuses on XML's core: how to write XML documents by following the rules that XML imposes. The lesson covers basic document structure, working with attributes, comments, and CDATA sections. The lesson also covers character encoding, which allows international users to read your XML documents, and namespaces - a feature that makes your XML documents more useful by allowing you to share them with others.
Saturday Evening: This is one of the longest lessons in the book focusing on document modeling using DTD and XSD. I suggest that you start reading this chapter as soon as you can after you complete Saturday afternoon's lesson so that you can complete the lesson in one evening.
Sunday Morning: The focus of this lesson is on using XML Spy and Stylus Studio to create and work with XML solutions. The lesson also covers XSL debugging using Stylus Studio - something that can save you hours of frustration when your XSL code does not work as you expect it should. This lesson also describes Microsoft's XML parser, called Microsoft XML Core Services, how to determine what version is installed on your system, and how to get the latest updates.
Sunday Afternoon: This lesson focuses on presenting data on the Web using presentation technologies like CSS and XSL. This lesson examines how to repurpose an XML document using XSL that you create using Stylus Studio's graphical XSL editor.
Sunday Evening: The focus of this lesson is to show you how to use XML with Internet Explorer's Data Source Object (DSO), the XML Document Object Model (XML DOM), and Microsoft's .NET Framework.
Appendix A: Presents an HTML and XPath reference.
Appendix B: Presents reference information on XML syntax and structure.
Glossary: Defines terms used throughout the book.

Sunday Evening - Programming with XML

You've come a long way in a very short time. Congratulations! With the essential aspects of XML behind you, you're now ready to get into working with XML. This lesson is made up of three major parts:

XML data binding with Internet Explorer.
Working with the XML DOM in Internet Explorer.
Using the Microsoft .NET Framework to work with XML.

The first part requires next to no programming experience at all, but the last two parts get into a fair amount of programming. You don't necessarily have to be a professional programmer to be able to understand the material in this lesson. A lot of people know just enough programming to write applications for themselves, while others are interested in programming to a degree but haven't had the opportunity to explore it on their own. If the latter describes your situation, this section will be of interest to you. Even if you're not at all experienced in programming, you should at least skim through this section to see XML from another perspective.

The samples in these three major parts use the following programming languages: C# (pronounced "see-sharp"), JavaScript, and VBScript. You don't necessarily need to know any of those programming languages, but some programming experience will help you understand the samples more easily. The discussion of the .NET Framework explains what it is and how it uses XML.

XML Data Binding with Internet Explorer

Data binding is a general term for connecting a data source with a control that's capable of displaying data. Data source describes anything that acts as a source of data, such as a database or an XML document. Control is another general term for a visual element that's used to lay out information on the screen, like a table.

Internet Explorer can bind an HTML table to an XML data source, without programming, so you can easily display formatted XML data within a table on an HTML page. This is made possible by something called a Data Source Object (DSO). The DSO doesn't require any programming and is very easy to work with. It often satisfies most people's requirements. The only drawback to using the DSO is that it's available only with Internet Explorer version 4.0 and later.

You're limited to working with an HTML table to display data that you acquire through the DSO, and the DSO is capable of working only with relatively simple data structures. The first example, shown in Figure 7.1, demonstrates how to use the DSO to generate a table that contains a listing of books, authors, and each book's Library of Congress (LOC) classification.

Figure 7.1 - Displaying XML data in an HTML table.

The Library of Congress (LOC) Classification System's class and subclass information is shown in the last two columns of the table. For more information on the Library of Congress Classification System, visit their Web site at http://www.loc.gov/.

The listing is based on information in the books.xml document, located in the \XMLInAWeekend\chapter07 folder. The page itself is called dso1.htm. It's made up of HTML code, and also includes tags that describe where the data in the table comes from and which cells of the table contain the data. The following listing presents most of the page, with the relevant sections shown in bold:

<html>
  <head>
    <title>Data Source Object Demo #1</title>
  </head>
  <body>
    <XML ID="xmldata" src="books.xml"></XML>
    <table datasrc="#xmldata" id="ListOfBooks" 
      width="80%" align="center" cellpadding="0" 
      cellspacing="0" border="1">
      <thead>
        <tr>
          <th>Title</th>
          <th>Author</th>
          <th>Class</th>
          <th>Subclass</th>
        </tr>
      </thead>
      <tr>
        <td valign="top"><div datafld="title"></div></td>
        <td valign="top"><div datafld="author"></div></td>
        <td valign="top"><div datafld="loc_class"></div></td>
        <td valign="top"><div datafld="loc_subclass"></div></td>
      </tr>
    </table>
    <br>
    <hr>
  </body>
</html>

Appendix A contains an HTML reference that describes all of the elements that make up a table and demonstrates how to create a table, so I won't repeat how to do that here.

The first bold part of the listing is the xml element. This xml element is very different from the xml element you've seen in XML document declarations. This one delimits what's referred to as an XML data island, which allows you to have an XML document either reside directly within an HTML document or refer to an XML document. The ID attribute is required because it's used to assign a name to the data island. Later you'll use the name of the data island to bind an HTML table to it. Instead of having the XML document appear inline with the rest of the HTML, this example refers to an XML document that resides in a separate file, as the src attribute indicates. Note that the xml element must have a closing tag.

An HTML table appears immediately following the XML data island declaration. Most of the table uses common attributes, except for the datasrc and ID attributes. The datasrc attribute associates the table with the XML data island by name. Note that the name is prefixed with the pound (#) sign. The table's ID attribute assigns a name to the table. This is optional, but it's a good practice to provide a name because you may wish to take advantage of the additional features that are demonstrated in the next example.

The names of the XML elements that you want to have in certain cells appear later in the table's declaration. Each field you want to have in the table must reside in an HTML div element, as shown in the preceding listing. The value of the datafld attribute must exactly match the name of the XML element whose value you want to have in the cell. You can also specify the names of attributes in the datafld attribute. The DSO is capable of matching XML attribute names and generally can handle rather complex XML documents. However, the DSO doesn't seem to like attributes in the root element's declaration.
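To make the matching rule a little more concrete, here is a small JavaScript sketch that imitates, conceptually, how each datafld name picks a value out of every record. This is not how Internet Explorer implements data binding. The bindFields function and the sample records are hypothetical, and the sketch assumes that names are compared exactly, including case.

```javascript
// Conceptual sketch only: imitates how each datafld name selects a value
// from every record. bindFields and the sample data are hypothetical.
function bindFields(records, fieldNames) {
  return records.map(function (record) {
    return fieldNames.map(function (name) {
      // A name that matches nothing produces an empty cell in this sketch.
      return Object.prototype.hasOwnProperty.call(record, name)
        ? String(record[name])
        : "";
    });
  });
}

// Made-up records standing in for the contents of an XML data island.
var sampleRecords = [
  { title: "Sample Title A", author: "Sample Author A", loc_class: "Q", loc_subclass: "QA" },
  { title: "Sample Title B", author: "Sample Author B", loc_class: "T" }
];

// One inner array per table row, one entry per bound cell.
var boundRows = bindFields(sampleRecords, ["title", "author", "loc_class", "loc_subclass"]);
console.log(boundRows);
```

The second sample record has no loc_subclass field, so its last cell comes back empty, which is roughly what you would expect from a field name that doesn't match anything.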

You can open the page directly in Internet Explorer to see the final result. Depending on how fast your system is, you may be able to see the table being formatted as Internet Explorer reads through the XML document and populates the table with data. The DSO reads XML data asynchronously, meaning that Internet Explorer picks up new data from the DSO whenever it's ready, while the DSO continues to read the data as quickly as it can. This feature boosts overall performance and makes users feel that Internet Explorer remains responsive, even when it loads larger data sets.

While the results are interesting, they can be better. There's a lot of information on the screen, and scrolling up and down can become tedious. The DSO provides a means of paging through a dataset using a few buttons and a tiny bit of coding.

Paging Through Long Data Sets

The DSO provides paging support for longer data sets. The primary benefit for users is that they can easily navigate through a data set on a page-by-page basis (see Figure 7.2).

Figure 7.2 - Navigational controls for paging through a long data set.

The sample file is called dso2.htm and is also located in the \XMLInAWeekend\chapter07 folder. Open it directly in Internet Explorer and try the navigation buttons.

Another advantage a paged data set provides is responsiveness. I mentioned earlier that the DSO and Internet Explorer work asynchronously to acquire and present data. When you first load dso2.htm, the table appears to be populated as soon as you open the page in Internet Explorer. As you play with the navigational controls or just look at the information on the screen, the DSO and Internet Explorer are hard at work, continuing to populate the table with information from the XML document. As a result, Internet Explorer continues to feel responsive even though it's working to keep the table completely populated.

The dso2.htm file is almost identical to dso1.htm, with a few minor changes, so I'll just focus on the new parts of the document. The following listing begins just after the opening HTML body tag and ends just after the first HTML tr tag of the table:

<body>
  <XML ID="xmldata" src="books.xml"></XML>
  <center>

    <input type="button" value="FIRST Page" 
      onclick="ListOfBooks.firstPage();"> 

    <input type="button" value="&lt;&lt; Previous" 
      onclick="ListOfBooks.previousPage();">

    <input type="button" value="Next &gt;&gt;" 
      onclick="ListOfBooks.nextPage();"> 

    <input type="button" value="LAST Page" 
      onclick="ListOfBooks.lastPage();">

  </center>
  <hr>
  <br>

  <table datasrc="#xmldata" datapagesize="5" id="ListOfBooks"
      border="1" width="80%" align="center" 
      cellpadding="0" cellspacing="0" >

    <tr>

The listing begins just after the HTML body element with the now familiar XML element, which is identical to the one you saw before. Immediately following that declaration is an HTML center element, which centers everything that resides within the beginning and ending tags.

There are four buttons on the page, as described by the HTML input elements. All of the input elements' type attributes cause them to be rendered as buttons, while the value attribute declares the caption that appears in the button. The value attribute uses predefined HTML entity references for the less-than symbol (&lt; - <) and greater-than symbol (&gt; - >) to avoid confusion with an HTML tag, because there are a lot of HTML tags close by.

The values of the onclick attributes are the key factors in providing navigational support to the user. Each onclick attribute's name comes from the event that the attribute captures. Internet Explorer evaluates the value of the onclick attribute when the user clicks one of the buttons. The value of the attribute uses the name of the table that displays the data, as shown in the table's ID attribute (ListOfBooks). This is followed by a dot, followed by the name of the page you want to navigate to. There are four named pages that you can navigate to: firstPage(), lastPage(), nextPage(), and previousPage(). The spelling, case, and parentheses that follow the name of each page are important and must appear exactly as shown.

The table element has one new attribute, datapagesize, which describes how many rows you want the table to display on a given page. Specify the number of items per page using a numeric value, optionally in quotes, as shown in the preceding listing.
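If you're curious about what paging amounts to, the following JavaScript sketch models it with a plain array. It's only an illustration under my own assumptions: createPager and currentItems are hypothetical names, and the way the real DSO treats a partial last page may differ from what this sketch does.

```javascript
// Hypothetical pager that mirrors the four navigation methods named above.
function createPager(items, pageSize) {
  var start = 0; // index of the first item on the current page

  // Index at which the last page begins.
  function lastStart() {
    if (items.length === 0) { return 0; }
    return Math.floor((items.length - 1) / pageSize) * pageSize;
  }

  return {
    firstPage: function () { start = 0; },
    lastPage: function () { start = lastStart(); },
    // Moving past either end leaves the pager on the first or last page.
    nextPage: function () { start = Math.min(start + pageSize, lastStart()); },
    previousPage: function () { start = Math.max(start - pageSize, 0); },
    currentItems: function () { return items.slice(start, start + pageSize); }
  };
}

// Twelve made-up items, five per page, like datapagesize="5".
var demoPager = createPager([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 5);
demoPager.nextPage();
console.log(demoPager.currentItems());
```

Keeping the position as a single starting index makes all four operations one-liners, which is part of why the DSO can expose paging with so little code on your side.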

While paging support is helpful, you can go a step further by allowing users to define how many items they want to show per page.

Dynamically Changing the Number of Items Per Page

Allowing your users to change how many items appear per page increases the usability of the solution, because it helps to accommodate users with smaller or lower-resolution screens that can't fit as much information. This option also makes the solution more interactive by allowing users to customize the display to suit their preference.

This solution builds on the last one by adding two more HTML input elements and some simple JavaScript code. The input elements capture the user's requested number of items per page, and the JavaScript code dynamically reconfigures the DSO and resets the display after each change. The overall effect is shown in Figure 7.3.

Figure 7.3 - Allowing users to change the number of items per page.

To get a feel for how this feature works, use Internet Explorer to open the dso3.htm file, located in the \XMLInAWeekend\chapter07 folder. Navigate to another page in the data set and then change the number of items per page. The number of items changes and the page resets to the first page of the data set. Changing the number of items per page can make it difficult to figure out where you are in the data set, but resetting the display to the first page will minimize that effect.

The first change to the page occurs with the introduction of two more HTML input elements, as shown by the bold lines in the following listing:

<center>

  <input type="button" value="FIRST Page" 
    onclick="ListOfBooks.firstPage();"> 

  <input type="button" value="&lt;&lt; Previous" 
    onclick="ListOfBooks.previousPage();">

  <input type="button" value="Next &gt;&gt;" 
    onclick="ListOfBooks.nextPage();"> 

  <input type="button" value="LAST Page" 
    onclick="ListOfBooks.lastPage();">

  <input type="text" maxlength="2" size="2" id="itemsPerPage">
  <input type="button" value="change" onclick="changePageSize();">

</center>

The first new input element is a text input field that captures the user's preferred number of items per page. The critical part of the declaration is the ID attribute that assigns a name to the field (itemsPerPage). The second input element is a button that the user clicks to carry out the requested change. Like other buttons on the page, this button has an onclick attribute. This time, the attribute refers to the name of a JavaScript function that applies the change. You can use any name you like when you create your own pages, but be sure to use a descriptive name that suggests what the JavaScript function does.

The page is not yet complete, because the JavaScript code that handles the new button's click event is not part of the page. JavaScript code typically resides within a script element, which in turn resides between the HTML page's head starting and ending tags (which appear just after the html element, but before the body element), as shown in the following listing:

<html>
  <head>
    <title>Data Source Object Demo #3</title>

    <script language="JavaScript">

    function Initialize()
    {
      itemsPerPage.value = ListOfBooks.dataPageSize;
    }
    
    function changePageSize()
    {
      ListOfBooks.dataPageSize = itemsPerPage.value;
      ListOfBooks.firstPage();
    }

    </script>

  </head>

  <body onload="Initialize();">

    <XML ID="xmldata" src="loc_classes.xml"></XML>
    <!-- rest of page... -->

There's a lot going on in this listing, so let's start in somewhat familiar territory with the body tag, near the end of the listing. The body tag has an attribute called onload, which you saw in Saturday morning's lesson. Internet Explorer evaluates the value of the onload attribute when it has finished loading the document but hasn't displayed it yet. As a result, the onload attribute usually refers to the name of a function that carries out initialization tasks, which get the page ready for display.

JavaScript is a popular programming language that most Web developers are at least familiar with. It's so popular because it's easy to use and is supported by major browsers, including Internet Explorer. JavaScript code on an HTML page must appear within a script element whose declaration includes the name of the programming language used by the code within it. Internet Explorer is adept at figuring out which programming language appears within a script element, making the value of the language attribute optional. However, it's good practice to specify the name of the programming language you're using. For Internet Explorer, valid settings for the script element's language attribute are JavaScript, JScript, and VBScript.

The body element's onload attribute refers to a function called Initialize() that appears immediately after the script element's starting tag. You can place this function wherever you want, as long as it appears within the script element. Generally, it's good practice to put a function like this immediately after the starting script tag, or just before the ending script tag.

The Initialize() function sets the value that appears in the itemsPerPage input field so it's equal to the number of items per page that the table (ListOfBooks) displays, to let the user know how many items per page the table starts with. The single line of code in the function refers to the name of the input field, itemsPerPage, followed by the field's property that holds the value you see on the screen (value), followed by an equals sign. So far, the statement says, "Assign a value to the itemsPerPage field." The code that follows the equals sign describes what to assign to the input field: the number of items per page that the table displays (its dataPageSize).

The next function, changePageSize(), handles the onclick event for the button that's next to the input field. The first line of the function assigns the value in the itemsPerPage input field to the ListOfBooks dataPageSize attribute. The line that follows resets the table's page back to the first page of the data set.
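The changePageSize() function assigns whatever the user typed, so it's worth thinking about what happens when the field contains something that isn't a positive number. The following sketch shows one way to guard against that. The parsePageSize helper is hypothetical and isn't part of dso3.htm.

```javascript
// Hypothetical helper: turn the text typed into the itemsPerPage field into
// a usable page size, keeping the current size when the input is not valid.
function parsePageSize(text, currentSize) {
  var trimmed = String(text).replace(/^\s+|\s+$/g, "");
  // Accept digits only, so "abc", "-3", "2.5", and "" are all rejected.
  if (!/^\d+$/.test(trimmed)) {
    return currentSize;
  }
  var size = parseInt(trimmed, 10);
  return size >= 1 ? size : currentSize;
}

// Inside changePageSize(), the assignment could then read (assumption):
//   ListOfBooks.dataPageSize =
//     parsePageSize(itemsPerPage.value, ListOfBooks.dataPageSize);

console.log(parsePageSize("10", 5));
console.log(parsePageSize("abc", 5));
```

Falling back to the current size, rather than showing an error, keeps the page usable with the least amount of extra code.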

Keep the following in mind as you design your own pages that use the DSO:

Root XML elements cannot have attributes.
XML documents can be reasonably complex.
The DSO is capable of working with attributes. Simply refer to them by name.
Assign names to all buttons, tables, and text input fields using the ID attribute.
When creating a paged data set, assign a value to the dataPageSize attribute of the table that displays the data set for best performance.

The next example uses VBScript and the XML DOM to explore the structure of an XML document and write its contents to the screen.

Exploring an XML Document Using the XML DOM

The XML DOM (Document Object Model) provides a means of programmatically creating and manipulating XML documents. The DOM exposes (presents) an XML document by reading it into the computer's memory and then representing each aspect of it as a distinct object. An object is a logical element (meaning that it doesn't really exist) that represents something like a part of an XML document, a string, or even a file. An object not only refers to something, it also offers a means of manipulating it by exposing functionality that's appropriate for the object. For example, an object that represents a string of characters (simply referred to as a string) exposes functionality that allows programmers to compare, add, and manipulate strings. An object that represents part of an XML document may expose functionality that allows you to navigate to another part of the document, transform it using XSL, or add new information to it.

While an XML document essentially conveys information about the structure and content of the document using elements, an XML DOM represents an XML document as a collection of node objects. XML documents have several types of elements, so the XML DOM has several types of nodes, as shown in Table 7.1.

Table 7.1 Types of XML Nodes

Type	Description
document	Represents the document starting at the XML declaration. The document has only one element child node, which represents the document's root element.
documentFragment	Represents a fragment of an XML document.
element	Represents an XML element.
text	Represents the (text) content of an element or an attribute.
attribute	Represents an element's attribute.
cdatasection	Represents an element's CDATA data.
processinginstruction	Represents a processing instruction.
comment	Represents a comment.
entity	Represents an entity, as described by the document's DTD.
entityreference	Represents an entity reference, as described by the document's DTD.
documenttype	Represents the document's DTD.
notation	Represents a node notation in the document's DTD.

Each node in the XML DOM exposes a property called nodeType that describes what type of node it is, enabling you to process nodes based on their type. The example in this section processes attributes and elements based on the value of each node's nodeType property, as you'll see shortly.
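The nodeType property holds a number rather than a name. The following JavaScript sketch pairs the numeric values defined by the W3C DOM with the type names from Table 7.1. The describeNodeType and isElementOrAttribute helpers are hypothetical names that I made up for illustration.

```javascript
// Numeric nodeType values as defined by the W3C DOM, paired with the
// type names listed in Table 7.1.
var NODE_TYPE_NAMES = {
  1: "element",
  2: "attribute",
  3: "text",
  4: "cdatasection",
  5: "entityreference",
  6: "entity",
  7: "processinginstruction",
  8: "comment",
  9: "document",
  10: "documenttype",
  11: "documentfragment",
  12: "notation"
};

// Hypothetical helper: look up the name for a numeric node type.
function describeNodeType(nodeType) {
  return NODE_TYPE_NAMES[nodeType] || "unknown";
}

// Hypothetical helper: true only for elements (1) and attributes (2).
function isElementOrAttribute(nodeType) {
  return nodeType === 1 || nodeType === 2;
}

console.log(describeNodeType(1));
console.log(isElementOrAttribute(3));
```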

One of the interesting features of the XML DOM is that it allows you to traverse an entire document without actually referring to nodes by name, very much like you saw in the previous lesson using XPath expressions. The example for this lesson is an HTML page containing VBScript code that explores the structure of an XML document using the XML DOM, and then writes out the structure and data onto the screen. Figure 7.4 shows what the output of the page looks like.

Figure 7.4 - Exploring an XML document using the XML DOM.

The sample is in the XML-DOM.htm file located in the \XMLInAWeekend\chapter07 folder. If you extracted the contents of the book's accompanying ZIP file, as described in the Preface, you're set to go. Otherwise, you'll have to edit the page to change the location of the XML document the sample relies on. The sample uses the carparts.xml document from the previous lesson, because you're probably familiar with its structure and content by now. Open the document to view its results. There's a chance that Internet Explorer will ask if you want to continue to let the page run. Answer Yes to allow the code in the page to continue processing the complete document.

The page is made up of about 95% VBScript code, with the remaining 5% representing supporting HTML code, so I won't describe the HTML code here.

The code is made up of three functions, all written in VBScript. The Initialize function acts as the starting point and gets called when you first load the page. Its role is to load the XML document into the DOM and start the processing by calling the walk_elements function. The walk_elements function traverses elements in the XML DOM by recursively calling itself to display the contents of each element. When walk_elements encounters an element with attributes, it calls the walk_attributes function, which displays the contents of all attributes for an element. The code isn't too complex, except for the part where walk_elements calls itself. When a function calls itself, it's said to be recursive. Recursive programming is an efficient means of processing many types of logical structures. Rather than present all 95 lines that make up the code on the page in one chunk, I'll walk you through the code as it executes when it processes an XML document.

The very first thing that happens when the page loads is that the following line executes:

indent = 0

This line creates a numeric, global variable that all code on the page can access, and sets its initial value to 0. This variable controls a key factor in the page's layout: the indentation of each line. Next, the now-familiar Initialize function is called as a result of the HTML document's body element onload attribute setting. The following listing presents most of the Initialize function:

Function Initialize()
  Dim root
  Dim xmlDoc
  Dim child
  Dim indent

  indent=0
  Set xmlDoc = CreateObject("Msxml.DOMDocument")
  xmlDoc.async = False
  xmlDoc.validateOnParse=False

  xmlDoc.load("\XMLInAWeekend\chapter06\carparts.xml")

  If xmlDoc.parseError.errorcode = 0 Then
    Document.Write("

")
    walk_elements(xmlDoc)
    Document.Write("

")
  Else
    ' code omitted for brevity...
  End If
End Function

After the initial Dim statements that declare some local variables, the function begins by creating an instance of the XML DOM object using the VBScript CreateObject function. The XML DOM exposes some properties that control its behavior when working with XML documents. The two lines that follow set the DOM object's async and validateOnParse properties to False, essentially disabling those two options (loading the document asynchronously and validating the document as it loads). The If statement determines if there were any errors loading the document by evaluating the errorCode property of the DOM object's parseError object. If there weren't any errors, the function calls walk_elements to begin processing the document.
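The listing leaves out the branch that runs when the document fails to load. As a rough idea of what such a branch could report, the following JavaScript sketch builds a message from the errorCode, reason, line, and linepos properties that the parseError object exposes. This is my own illustration and not the code from XML-DOM.htm. The describeParseError function and the sample values are hypothetical.

```javascript
// Hypothetical helper: build a readable message from an object shaped like
// the DOM's parseError object (errorCode, reason, line, linepos).
function describeParseError(parseError) {
  if (parseError.errorCode === 0) {
    return "No error";
  }
  // Strip trailing whitespace from the reason text.
  var reason = String(parseError.reason).replace(/\s+$/, "");
  return "Error " + parseError.errorCode +
    " on line " + parseError.line +
    ", position " + parseError.linepos +
    ": " + reason;
}

// Made-up values standing in for a real parse error.
var sampleError = { errorCode: -1, reason: "Sample reason text.\r\n", line: 3, linepos: 7 };
console.log(describeParseError(sampleError));
```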

The walk_elements function takes a single parameter: a node. When Initialize calls walk_elements the first time, it passes the function the instance of the XML DOM, which is essentially a special type of node that represents the entire XML document. (See Table 7.1 for the types of nodes.) The function begins by initializing a For loop that executes once for each child node, as shown in the following listing:

function walk_elements(node)
  dim nodeName
  dim count
  count = 1
  indent=indent+2

  For Each child In node.childNodes
    For i = 1 To indent
        Document.Write("&nbsp;")
    Next

The next thing it does is indent the line it's about to write out by writing out several non-breaking space characters using the predefined HTML entity reference (&nbsp;). The next block of code writes out the node's type and its name, as shown in the following listing:

    Document.Write("+--")
    Document.Write("<u>" & child.nodeTypeString & "</u>: ")
    If child.nodeType < 3 Then
      Document.Write("<b>&lt;" & child.nodeName _
        & "&gt;</b>  [#" & count & "]<br>")
      count = count + 1
    End If

The child.nodeTypeString represents the node's type as a string value, as shown in the Type column in Table 7.1. The child.nodeType property represents the node type as a numeric value. An element's nodeType value is 1, and an attribute's nodeType value is 2. As a result, the If statement ensures that only elements and attributes are shown on the screen.

The next block of code checks the current node to determine if it's an element, and then checks it for any attributes:

    If (child.nodeType = 1) Then
      If (child.attributes.length > 0) Then
        indent=indent+1
        walk_attributes(child)
        indent=indent-1
      End If
    End If

If the node has any attributes, as determined by the value of the attributes.length property, the function calls the walk_attributes function, passing it the current node to generate a list of attributes. The walk_attributes function is discussed shortly. The last actions that the walk_elements function takes are to call itself to process any child nodes and manage the value of the indent variable by decreasing its value:

    If (child.hasChildNodes) Then
      walk_elements(child)
    Else
      Document.Write  child.text & "<br>"
    End If
  Next
  indent=indent-2
End Function
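If recursion is new to you, it may help to see the same idea without the DOM getting in the way. The following JavaScript sketch walks a tree of plain objects and collects one indented line per node. Everything in it is hypothetical: the walkTree function, the shape of the objects, and the sample data. Unlike walk_elements, it passes the current depth as a parameter instead of adjusting a global indent variable.

```javascript
// Plain objects stand in for DOM nodes: { name, text, children }.
// walkTree is a hypothetical name; depth replaces the global indent variable.
function walkTree(node, depth, lines) {
  var pad = new Array(depth * 2 + 1).join(" "); // two spaces per level
  for (var i = 0; i < node.children.length; i++) {
    var child = node.children[i];
    if (child.children.length > 0) {
      lines.push(pad + "+--" + child.name);
      walkTree(child, depth + 1, lines); // the recursive step
    } else {
      // A leaf has no children, so write its text and stop descending.
      lines.push(pad + "+--" + child.name + ": " + child.text);
    }
  }
  return lines;
}

// Made-up document structure for illustration.
var sampleDocument = {
  name: "#document", text: "", children: [
    { name: "parts", text: "", children: [
      { name: "part", text: "", children: [
        { name: "name", text: "Sample part", children: [] }
      ] },
      { name: "note", text: "leaf value", children: [] }
    ] }
  ]
};

console.log(walkTree(sampleDocument, 0, []).join("\n"));
```

Passing the depth along with each call means there's nothing to restore when a call returns, which is the job that the indent=indent-2 line does in the VBScript version.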

The walk_attributes function is a lot simpler than the walk_elements function because attributes use a simple structure made up of a name-value pair, as described in Saturday afternoon's lesson. The walk_attributes function is shown in the following listing:

Function walk_attributes(node)
  For Each attrib In node.attributes
    For i=1 to indent
      Document.Write("&nbsp;")
    next
    Document.Write("o--" & attrib.nodeTypeString & "")
    Document.Write(": " & attrib.name & " -- {" _
      & attrib.nodeValue & "}<br>")
  Next
End Function

This function behaves in a similar way to the walk_elements function. It indents each line by the number of spaces described by the value of the indent variable, writes out the string representation of the nodeType property, and writes out the attribute's name and value. The function's code resides within a For loop that executes once for each attribute.
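The same formatting can be sketched in JavaScript using plain name-value objects in place of DOM attribute nodes. The formatAttributes function and the sample attributes are hypothetical. The sketch pads with ordinary spaces and returns strings, rather than writing non-breaking spaces into the page.

```javascript
// Hypothetical helper: format name-value pairs the way walk_attributes
// lays them out, one line per attribute, indented by `indent` spaces.
function formatAttributes(attributes, indent) {
  var pad = new Array(indent + 1).join(" ");
  return attributes.map(function (attrib) {
    return pad + "o--attribute: " + attrib.name + " -- {" + attrib.value + "}";
  });
}

// Made-up attributes for illustration.
var sampleAttributes = [
  { name: "id", value: "42" },
  { name: "color", value: "red" }
];

console.log(formatAttributes(sampleAttributes, 3).join("\n"));
```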

You can try this sample with other, relatively short XML documents by changing the value in the parameter to the xmlDoc.load call in the Initialize function.

The Microsoft .NET Framework is a set of technologies that provide a unified programming model for traditional Windows-based and Web-based applications, based on a comprehensive class library that exposes system functionality. The .NET Framework uses XML in many ways, including transferring information between systems. The next section provides a brief introduction to the .NET Framework and demonstrates an application that uses the C# programming language.

XML and the .NET Framework

The .NET Framework is made up of three key elements:

Common Language Runtime
.NET Class Library
Unifying Components

The Common Language Runtime is a logical layer that separates an application from the platform it executes on, providing execution services such as memory management, error handling, and thread management. It abstracts the details of the operating system, processor architecture, and interface between it and a specific programming language. This makes it easier for developers to create applications, and for applications to work with one another. One of the key benefits of the Common Language Runtime is that it supports a variety of programming languages, including Visual C++ .NET, Visual Basic .NET, JScript .NET, and Visual C# .NET.

The .NET Class Library provides a consistent programming model because its functionality is accessible through all programming languages supported by the .NET Framework. It allows developers to easily access system functionality such as file and database access, advanced drawing support, input and output operations, and interoperability features, such as data interchange between networked systems.

The .NET Framework's functionality is exposed through a set of unifying components that include ASP.NET, Windows Forms, and Visual Studio .NET. ASP.NET is the next generation of Microsoft's Active Server Pages (ASP), with support for a programming model that's familiar to developers who create traditional Windows-based applications. ASP.NET also supports Web Services, a new way of exposing services that are designed to be used by people through applications and Web sites. Windows Forms is a unified means of creating traditional Windows applications that have a graphical user interface across all supported programming languages. Visual Studio .NET is a development environment that's tightly integrated with the .NET Framework, making it a great tool for developing and deploying network-centric and network-aware applications and services.

The .NET Framework uses XML throughout, for everything from configuration files to native support for XML in ADO.NET (a data access technology) to interchanging information with other systems. The .NET Class Library provides many classes that work with XML, making it easier to create applications that are XML-aware.

The example for this discussion is a traditional Windows-based application, written in Visual C# .NET, that reads an XML document and displays it in a Windows Forms TreeView control, as shown in Figure 7.5.

Figure 7.5 - A Windows Forms application showing the contents of an XML document.

Your system must have the .NET Framework installed on it to compile and work with this application. The .NET Framework is available for free from Microsoft's MSDN Web site at http://msdn.microsoft.com. The system requirements are posted there as well. Please review them before you download the .NET Framework, because it's a large download. If you already have the .NET Framework installed but don't have Visual Studio .NET, I've provided a compiled version of the sample code along with the book's source code distribution. You should be able to just start the application to try it out. The name of the application is xmlTreeView, and it's located in the \XMLInAWeekend\chapter07\dotNET folder. The name of the application's executable is xmlTreeView.exe.

The application's controls are straightforward. Click on the button to open a file selection dialog box and initiate reading the XML document you want to view. Use the Expand Tree check box before you select a file to have its contents expanded in the tree view automatically when the file is loaded. Expanding all elements automatically can take a long time if the document is large or has a complex structure, so use this feature with caution. One feature of the application that's not readily apparent is that you can drag the left and bottom borders of the window, which drags the left and bottom edges of the tree view control along with it. This makes it possible to view longer or wider XML documents in the tree view control without having to scroll sideways or up and down.

Unlike the previous example, which uses the XML DOM to explore the structure and content of an XML document, this example uses a forward-only, read-only stream of data to quickly read through the XML document. (If you're familiar with XML processor programming models, this one is similar to the model offered by SAX, the Simple API for XML. The difference is that the parser doesn't raise events as it reads the XML document. Instead, the application tells the parser when to read the next block of information from the XML document.) The application displays the document in a System.Windows.Forms.TreeView control, which makes it easy for users to inspect the document using a familiar interface while using relatively little space on the screen.
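One consequence of the application driving the parser is that it can stop reading as soon as it has what it needs. The following sketch illustrates this with System.Xml.XmlTextReader; the PullDemo class and the FirstElementText method are hypothetical names used only for this illustration, and the XML comes from a string rather than a file.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper: PullDemo and FirstElementText are illustrative names.
public static class PullDemo
{
    // Returns the text of the first element with the given name,
    // or null if the document has no such element.
    public static string FirstElementText(string xml, string name)
    {
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            // Each call to Read() asks the parser for the next node;
            // nothing is parsed until the application asks for it.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == name)
                {
                    // Stop here: the rest of the document is never read.
                    return reader.ReadString();
                }
            }
        }
        return null;
    }
}
```

With an event-based parser such as SAX, the parser decides when each callback runs, so ending the parse early usually takes extra work, such as raising an exception from a handler.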

The code that manages the display is rather involved and is beyond the scope of this book, so I won't cover it here. I'll describe most of the code as it would execute when you run the application. The code is a lot simpler than the previous example, because its focus is on reading the XML document and adding information to the tree view control. The .NET Class Library handles the rest of the details.

The application begins by presenting what's referred to as a file dialog box that allows the user to select which file to view. The following listing shows how the application creates an instance of the dialog and works with it:

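OpenFileDialog oFileDlg = new OpenFileDialog();
oFileDlg.Filter = "XML files (*.xml;*.xsl)|*.xml;*.xsl|All files (*.*)|*.*";
oFileDlg.FilterIndex = 1;
oFileDlg.RestoreDirectory = true;
// Show the dialog box; continue only if the user selected a file
if (oFileDlg.ShowDialog() == DialogResult.OK)
{
  populateTreeView(oFileDlg.FileName);
}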

Visual C# .NET's syntax is similar to that of C++, so the code should look familiar if you already know that language. The code in the preceding listing simply captures the name of the file that the user wants to view and passes it on to another function called populateTreeView, which is where the core functionality of the application resides.

The role of the populateTreeView function is to read XML data from the XML document and transfer that information to the TreeView control for display. The function uses a System.Xml.XmlTextReader object that provides fast, forward-only access to an XML document without validating it as it reads the document. The first thing the function does is open the XML document based on a System.IO.FileStream object, as shown in the following listing:

FileStream fileStreamObject;
XmlTextReader xmlReader;

// strFile contains the name of the file the user wants to open
fileStreamObject = new FileStream(strFile, FileMode.Open, FileAccess.Read);
xmlReader = new XmlTextReader(fileStreamObject);
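Both the stream and the reader hold on to the open file until they are closed. As a hedged sketch of one way to handle that, the following code wraps both objects in using statements so they are released even if reading fails. The ReaderSetup class and the RootElementName method are hypothetical names, not part of the book's sample application.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper: ReaderSetup and RootElementName are illustrative names.
public static class ReaderSetup
{
    // Opens the file read-only and returns the name of its document element.
    public static string RootElementName(string path)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (XmlTextReader reader = new XmlTextReader(stream))
        {
            // MoveToContent skips the XML declaration, comments, white space,
            // and processing instructions that precede the first element.
            reader.MoveToContent();
            return reader.Name;
        }
    }
}
```

A call such as ReaderSetup.RootElementName(strFile) would return the name of the outermost element in the file the user selected.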

The code reads through the XML document using a loop that continues as long as the XmlTextReader object can successfully read another node from the XML document. The application uses a switch statement to determine what type of node it's working with, because some nodes, like comments, CDATA sections, and processing instructions, are displayed a little differently than element nodes. The following listing presents the part of the application that handles element nodes:

switch(xmlReader.NodeType)
{
  // code omitted for brevity...

  case XmlNodeType.Element:

    xmlNode = new TreeNode("<" + xmlReader.Name + ">");
    emptyElement = xmlReader.IsEmptyElement;

    while(xmlReader.MoveToNextAttribute())
    {
      TreeNode attNode = new TreeNode("Attribute");
      attNode.Nodes.Add(xmlReader.Name + "='" + xmlReader.Value + "'");
      xmlNode.Nodes.Add(attNode);
    }
    break;
  // code omitted for brevity...

}
xmlTree.Nodes.Add(xmlNode);

For each element, the code creates an instance of a TreeNode object that contains the name of the element, along with its attributes. Attributes are added as children of the element's node to take advantage of the TreeView control's display capabilities. Just before moving on to the next node, the code adds the TreeNode object it created earlier to the TreeView control (the last line of the listing).
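To try the element-handling logic without a Windows Forms project, the same reader calls can be sketched as a method that builds an indented text outline instead of TreeNode objects. This is a simplified stand-in under stated assumptions: the ReaderOutline class and the Build method are hypothetical names, the XML comes from a string, and node types other than elements are skipped.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: ReaderOutline and Build are illustrative names.
public static class ReaderOutline
{
    // Returns one line per element, followed by one line per attribute,
    // indented according to the element's depth in the document.
    public static string Build(string xml)
    {
        StringBuilder output = new StringBuilder();

        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                {
                    continue; // other node types are skipped in this sketch
                }

                // Capture these before visiting the attributes, because the
                // reader reports different values once it sits on an attribute.
                int depth = reader.Depth;
                bool emptyElement = reader.IsEmptyElement;

                output.Append(' ', depth * 2).Append('<').Append(reader.Name)
                      .Append(emptyElement ? "/>" : ">").Append('\n');

                // Attributes appear one level below their element, mirroring
                // the child nodes added in the TreeView listing.
                while (reader.MoveToNextAttribute())
                {
                    output.Append(' ', depth * 2 + 2).Append(reader.Name)
                          .Append("='").Append(reader.Value).Append("'\n");
                }
            }
        }
        return output.ToString();
    }
}
```

The sketch uses the reader's Depth property for indentation, which is enough for flat text output; nesting nodes in a TreeView control would also require keeping track of each element's parent node.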

This concludes the brief tour of the .NET Framework and Windows Forms applications. If you're interested in learning more about the .NET Framework or the various programming languages used in this lesson, refer to the Resources section of the book for some Web site addresses of interest, or visit my Web site (http://www.designs2solutions.com). The site lists the same resources, and I keep the links there up to date in case they change.

Summary

You've come a long way this weekend, and now you're ready to use your newly acquired understanding of XML and its related technologies. XML technologies are rapidly changing to meet the changing needs of industry. You should try to keep up with the latest developments by regularly checking the World Wide Web Consortium's Web site (http://www.w3.org) and perhaps subscribing to some of the excellent newsletters that are available from the major XML portal sites. XML is a great technology that's working its way into all facets of the computing industry, and it reflects the industry's drive to address some of its long-standing problems using an open, publicly available standard.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
