ASP Parser

28 Apr 2004
This article presents a simple way to parse and analyze ASP document structure.

Introduction

While working with ASP.NET (ASP+), I developed a few design-time tools. Early on, I ran into a problem: there was no way to analyze the structure of an ASPX or ASCX document. Microsoft does not provide anything like an ASP Document Object Model (similar to the XML DOM). I therefore wrote my own ASP parser, which analyzes the tags of an ASPX / ASCX document and builds a tree of objects that represents the document's structure. This ASP DOM proved very useful in my automatic localization mechanism, but it could be used in many other cases.

Example

When creating a new ASPX document, it usually looks similar to this one:

<%@ Page language="c#" Codebehind="Empty.aspx.cs" 
   AutoEventWireup="false" Inherits="PersistingObjects.Empty" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" > 
<html>
  <head>
    <title>Empty</title>
    <meta name="GENERATOR" Content="Microsoft Visual Studio .NET 7.1">
    <meta name="CODE_LANGUAGE" Content="C#">
    <meta name=vs_defaultClientScript content="JavaScript">
    <meta name=vs_targetSchema 
      content="http://schemas.microsoft.com/intellisense/ie5">
  </head>
  <body MS_POSITIONING="GridLayout">
    <form id="Form1" method="post" runat="server">
    </form>
  </body>
</html>

After the document is parsed with the presented ASP DOM, it can be displayed as a tree of objects.

The tree contains the tags that occur in the ASPX document. Each tag (including <OPEN_TAG>, </CLOSE_TAG>, <EMPTY_TAG/>, and tags that do not require a closing tag, e.g. <META>, <BR>) is represented by a tree node with a collection of attributes and its child tags beneath it.

Details

To create the ASP Document, use the following code:

string asp =
  "<asp:linkbutton id=\"LinkButton1\" runat=\"server\">text</asp:linkbutton>";
ASP.Document root = new ASP.Document(asp);

After the ASP.Document is created, it can be traversed recursively, e.g.:

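public void Traverse(ASP.Tag tag)
{
    foreach(ASP.Tag child in tag.ChildTags)
    {
        // Print the tag itself.
        Console.WriteLine(child.Value);

        // Print each attribute as key=value.
        foreach(ASP.Attribute attribute in child.Attributes)
        {
            Console.WriteLine(attribute.Key + "=" + attribute.Value);
        }

        // Recurse into the child; passing the parent again would never terminate.
        Traverse(child);
    }
}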

Because ASP.Document inherits from ASP.Tag, the Traverse method can be called like this:

Traverse(root);
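
To make the nesting visible, the same recursion can carry a depth and indent each line. The sketch below is only an illustration: the ASP library itself is not reproduced here, so `Node` is a hypothetical stand-in that mirrors the `Value`, `Attributes` and `ChildTags` members used above, and `TreePrinter` is an assumed helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for ASP.Tag: only the members used in the article.
public class Node
{
    public string Value;
    public List<KeyValuePair<string, string>> Attributes =
        new List<KeyValuePair<string, string>>();
    public List<Node> ChildTags = new List<Node>();

    public Node(string value) { Value = value; }
}

public static class TreePrinter
{
    // Returns one line per descendant, indented two spaces per nesting level.
    public static string Dump(Node tag)
    {
        var sb = new StringBuilder();
        Dump(tag, 0, sb);
        return sb.ToString();
    }

    private static void Dump(Node tag, int depth, StringBuilder sb)
    {
        foreach (Node child in tag.ChildTags)
        {
            sb.Append(' ', depth * 2).Append(child.Value);
            foreach (var attribute in child.Attributes)
            {
                sb.Append(' ').Append(attribute.Key).Append('=').Append(attribute.Value);
            }
            sb.Append('\n');

            // Recurse into the child, one level deeper.
            Dump(child, depth + 1, sb);
        }
    }
}
```

With the real library, the same shape should work by replacing `Node` with `ASP.Tag`, assuming its collections can be enumerated as in the Traverse method.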

Tags

The core object of the parsed ASPX/ASCX document is ASP.Tag. There are several types of tags: Root, Open, Close, Text, Directive, Code and Comment. The type of an ASP.Tag can be checked with the Tag.TagType property.

  1. Root tag - the only tag that has the type of Root is ASP.Document. This tag type does not contain attributes.
  2. Open tag - e.g.: <title>. This tag type can contain attributes.
  3. Close tag - e.g.: </title>. This tag type does not contain attributes.
  4. Text tag - any text that occurs between other tags, e.g. in <title>TEXT</title>, the TEXT appears as a Text tag in the ASP.Document tree. This tag type does not contain attributes.
  5. Directive tag - each tag that begins with %@, e.g.: <%@ Page language="c#" ... %>. This tag type can contain attributes.
  6. Code tag - each tag that begins and ends with % (except the Directive, which begins with %@, and the Comment, which begins with %--), e.g.: <% this.DoSomething() %>. This tag type does not contain attributes.
  7. Comment tag - XML comment, e.g.: <!-- COMMENTED -->, or server side comment, e.g.: <%-- COMMENTED --%>. This tag type does not contain attributes.
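
The delimiter rules in the list above can be sketched as a small classifier. This is an illustration of the rules, not the parser's actual code: `TagKind` and `TagClassifier.Classify` are hypothetical names, and the function only inspects the opening characters of a single, already-isolated token.

```csharp
using System;

// Hypothetical stand-in for the article's tag types. Root is omitted
// because it is never produced from markup text.
public enum TagKind { Open, Close, Text, Directive, Code, Comment }

public static class TagClassifier
{
    // Classifies one raw token such as "<title>" or "<%@ Page %>".
    // Order matters: the more specific prefixes are tested first.
    public static TagKind Classify(string token)
    {
        if (token == null) throw new ArgumentNullException(nameof(token));
        string t = token.Trim();

        if (t.StartsWith("<%--", StringComparison.Ordinal)) return TagKind.Comment;   // server-side comment
        if (t.StartsWith("<!--", StringComparison.Ordinal)) return TagKind.Comment;   // HTML/XML comment
        if (t.StartsWith("<%@", StringComparison.Ordinal)) return TagKind.Directive;
        if (t.StartsWith("<%", StringComparison.Ordinal)) return TagKind.Code;
        if (t.StartsWith("</", StringComparison.Ordinal)) return TagKind.Close;
        if (t.StartsWith("<", StringComparison.Ordinal)) return TagKind.Open;
        return TagKind.Text;
    }
}
```

Testing `<%--` and `<%@` before the plain `<%` check is what keeps comments and directives from being reported as Code tags.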

Attributes

Each ASP.Tag object has a Tag.Attributes property that contains the list of the tag's attributes. The list is empty if the tag has no attributes or if its type does not support attributes. Therefore, only Open and Directive tags can have a non-empty attribute list.

Each attribute has a key (Attribute.Key) and a value (Attribute.Value). The value is an empty string ("") if the attribute has no value.
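
The key/value behavior described above, including the empty string for a missing value, can be illustrated with a small regular expression. This is a minimal sketch under stated assumptions, not the parser's real implementation: `AttributeSketch.Parse` is a hypothetical helper that takes only the attribute portion of a tag (the text after the tag name) and accepts double-quoted, single-quoted and unquoted values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttributeSketch
{
    // A key, optionally followed by '=' and a double-quoted,
    // single-quoted, or bare (unquoted) value.
    private static readonly Regex Pair = new Regex(
        @"(?<key>[\w:.\-]+)(?:\s*=\s*(?:""(?<val>[^""]*)""|'(?<val>[^']*)'|(?<val>[^\s""'>]+)))?");

    public static List<KeyValuePair<string, string>> Parse(string attributeText)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pair.Matches(attributeText))
        {
            // An unmatched group yields "", mirroring an empty Attribute.Value.
            result.Add(new KeyValuePair<string, string>(
                m.Groups["key"].Value, m.Groups["val"].Value));
        }
        return result;
    }
}
```

For the text `name=vs_defaultClientScript content="JavaScript"` taken from the example page, this yields two pairs, and an attribute written without a value comes back with an empty string.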

Limitations

The main limitation I found in the presented ASP DOM is that it gives read-only access to the parsed ASPX / ASCX document. It does not allow you to modify the ASP.Document structure and save it back to a string.

Summary

This article does not describe every feature of the presented ASP DOM; it only briefly shows how to use it. I am sure that most development work does not need the ASP DOM, but I wanted to share what I built in case it makes your life easier. For those who will use it: enjoy.

The source code attached to this article contains a sample application that displays a parsed ASPX document as a tree.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
