Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / web / ASP.NET

YADDU. Yet Another (MS-SQL) Database Documentation Utility

5.00/5 (2 votes)
17 Feb 2012CPOL11 min read 19K   324  
Outlines the construction of a simple data dictionary utility for MS-SQL using extended properties.

Introduction

In versions of MS-SQL prior to 2005 there was no programmable way of allowing the addition of additional information such as descriptions to tables or columns. Attempts to address this problem usually resulted in solutions that either added one or more information tables to each database or held the information in a completely separate store. In either case it is too easy for the additional information to get out of step with the parent DB. The introduction in MS-SQL 2005 of extended properties for database objects and the provision of stored procedures to manipulate them goes a long way to solving this problem.

This article presents some very simple classes to extract, format and update data dictionary information using extended procedures for MS-SQL databases together with a bare bones web UI making use of them. It is based on the recent reworking of a utility originally I wrote a little over 6 years ago. The original was restricted by having to work with SQL-7 and suffered from some of the problems noted above.

The main requirements that drove the development of the original version were to provide:

  • Provide a more readable summary of the information available from sp_help, preferably browser based
  • Allow the maintenance of comments/descriptions for table columns, ditto.

As presented this utility allows for the maintenance of the following information:

Table
Hidden
- Hides the table from the documentation utility.
Version
- The DB version that the table was introduced in
Comment
- A description of the table's purpose,
Column
Comment
- A description of the column's purpose,

Background

SQL Server Extended Properties

Introduced with SQL2K5 extended properties allow you to associate sql_variant values containing up to 7,500 bytes of data with objects in a MS-SQL server database.

Extended Properties have a number of advantages. Those that make them particularly suited to a data dictionary application are:

  • They are part of the database and are backed up and restored with the database.
  • Automatically dropped when the parent object (table, view, column) is dropped.
  • Can be viewed using SQL Server Management Studio

Extended Properties are maintained using three stored procedures:

  • sys.sp_addextendedproperty
  • sys.sp_updateextendedproperty
  • sys.sp_deleteextendedproperty

Unfortunately these stored procedures are a little "unfriendly" in their behaviour and it is necessary to check for the existence of the property you wish to add, update or delete before calling them.

information_schema and other views

These views allow us to examine DB structure without having to resort to the sysobjects table as was the case in earlier versions of MS SQL Server. For this piece we're interested in the following:

sys.extended_properties
- Lists all defined extended properties in the current DB.
sys.information_schema.tables
- Lists all tables and views in the current DB.
sys.information_schema.columns
- Lists column information for views and tables in the current DB.
sys.information_schema.table_constraints
- Lists constraints, such as primary keys, for columns and tables.
sys.views
- Lists views in the DB. Allows us to distinguish views from tables in information_schema.tables and to retrieve object IDs for table, view and column names.
sys.tables
- Lists tables in the DB. Serves the same purpose as sys.views.

XML in T-SQL

This is old hat now, but it allows us to generate a variety of output formats using some simple XSLT.

Using the Code

Class: Query

This has a number of static methods returning T-SQL scripts. The T-SQL is embedded in the application because we don't want to have to add anything to a DB in order for any application that we may write to run.

MethodScopeReturns
GetSchematapublic staticstring The T-SQL necessary to get schemata for one or more tables.
GetSingleTablepublic staticstring The T-SQL necessary to get a schema for a single table. Little more than a call forwarder to GetSchemata
ColumnListpublic staticstring The T-SQL necessary to get a list of columns for a single table.
UpdateColumnPropertypublic staticstring The T-SQL necessary to add or update an extended property for a table column.
UpdateTablePropertypublic staticstring The T-SQL necessary to add or update an extended property for a table or a view.

The scripts represented by these methods, especially GetSchemata, determine the extended properties that we create and update.

GetSchemata

For both the table and column properties we extract the extended properties from the sys.extended_properties view using a couple of derived tables. Should you want to maintain more extended properties then you'd start your modifications here. Were you so minded the number and name of extended properties available could be made configurable.

Table Properties
SQL
...
-- Pivot extended properties for the table.
left  join
    (select
      major_id,
      max(case when [name]='comment' then [value] else '' end) as comment,
      max(case when [name]='hidden'  then [value] else '0' end) as hidden,
      max(case when [name]='version' then [value] else '' end) as version
      from sys.extended_properties
      where class = 1 and minor_id = 0
      group by major_id
    ) as [tableProps]
  ...
Column Properties
SQL
...
 -- Pivot extended properties for the column.
 left  join
     (select
       major_id, minor_id,
       max(case when [name]='comment' then [value] else '' end) as comment,
       max(case when [name]='version' then [value] else '' end) as version
       from sys.extended_properties
       where class = 1
       group by major_id, minor_id
     ) as [columnProps]
 ...

It's worth noting that that columns are identified by the same major_id as their parent table and a minor_id. The minor_id is simply the column's index.

The output from GetSchemata is one or more rows of XML formatted fragments. These fragments are concatenated to give an XML fragment like that shown below. This allows us to use XSL to generate a number of different output formats.

XML
<Tables Name="lbs">
<table Name="Bike" IsView="0">
  <tableProps Comment="Description of a specific bike linking to owner and manufacturer."
              Hidden="0"
              Version="1.0.0">
    <column Name="ID" PrimaryKey="1" Type="uniqueidentifier" Length="16" AllowNull="0" DefaultValue="newid()">
      <columnProps Comment="Internal Primary Key GUID"
                   Version=""/>
    </column>
    <column Name="CustomerID" PrimaryKey="0" Type="uniqueidentifier" Length="16" AllowNull="1">
      <columnProps Comment="Owner ID. If bike not stock"
                   Version=""/>
    </column>

    :
    :
    :
    :

  </tableProps>
  </table>

  :
  :

</Tables>

UpdateColumnProperty, UpdateTableProperty

As far as I am aware there is no bulk update method for extended properties. A stored procedure call is required for each addition, removal or update. As a Data Dictionary utility is not likely to generate huge numbers of updates, assuming most tables rarely have more than a dozen or so columns and that a user can only update a single table at a time, this is unlikely to cause serious performance problems.

The T-SQL returned by these methods is straight-forward if ugly. A select to determine whether or not the property is already associated with the table or column directing a call to the add or update stored procedure as required

You will notice that the schema that owns the table is hard coded as 'dbo'. This is appropriate for the databases I tinker with at home. If you adopt this code then you may want to recover the schema name from the sys.schemas view.

Obtaining a table's schema name

SQL
select
  sys.schemas.name as [schemaName]
from sys.schemas
 inner join (select [name], schema_id from sys.tables
             union select [name], schema_id from sys.views)
             as systable
      on   systable.schema_id = sys.schemas.schema_id
      and  systable.name = 'Bike'

Class: XML

This simply wraps up the calls necessary to apply an XSL transform to an XML string.

MethodScopeReturnsComment
ApplyTransformpublic staticstringReturns a string containing the transform output.
ApplyTransformpublic staticvoidWrites the transform output to a file.

XSLT Templates

Five templates are included by way of demonstration.

FilenameOutputComment
ToHTML.xslGenerates a simple HTML page giving an sp_help like description of the table.Provides a hyperlink to an ASPX page where the user can update the table's extended properties.
ToDOC.xslA "document" format for use with Open Office Writer or MS Word.This is a cheat. The output is a simplified variation on the ToHTML template with a doctype tag which allows Writer or Word to open the "document" as formatted text. It saves a lot of hard work creating RTF and, as both Writer and MS Word allow documents to be saved in a variety of formats including PDF, it deals neatly with the problem of supplying read only copies of database documentation to third parties.
ToCSharp.xslA C# class providing access to the table name, its column names, the datalength and SQL data type of the each column.Less useful than in Framework 1.1 now that database object models are so easily generated, but still handy for working with DataTables and DataViews.
ToVB.xslA VB equivalent of ToCSharp.xslAs C# output.
ToSQL.xslA T-SQL script to add all defined extended properties to tables.Intended for bulk setting up of data on creation of a new version of a database where no extended properties have yet been set.

Each template follows the same extremely simple pattern:

XML
<!-- Match root -->
<xsl:template match="/">

  <!-- Insert/generate any document "header". -->

  <!-- Process each table in alphabetic order -->
  <xsl:apply-templates select="Tables/table">
    <xsl:sort select="Name"/>
  </xsl:apply-templates>

</xsl:template>

<!-- Deal with a single table -->
<xsl:template match="table">

  <!-- Lay out table specific information.  -->

  <!-- Process each column -->
  <xsl:apply-templates select="tableProps/column"/>

</xsl:template>

<xsl:template match="tableProps/column">
  <!-- Lay out a single column  -->
</xsl:template>

</xsl:stylesheet>

LINQ for XML - XElement Extension Class

The ASPX page that allows the user to update the table's extended properties is assembled on the fly using the output from Query::GetSingleTable and LINQ for XML. This proved quite straight-forward apart from a minor irritation; the XElement::Attribute method throws a null exception if there is no value for the named attribute. To keep the code easy(ish) to follow some XElement extension methods were found to be necessary.

MethodComment
AttributeStringReturns a string value for the named attribute. Empty string if no alternative default value supplied.
AttributeBoolReturns a boolean value for the named attribute. False if no alternative default value supplied.
AttributeValueReturns an object for the named attribute. A default must be supplied when the call is made.

UI Notes

The UI is very simple. It has only 4 main components

FilenameComment
default.aspxA login page.
sp_helpDisplays schemata for all selected tables. There is no aspx or other file associated with this page. It is created on request.
dataDic.aspxAllows the update of data dictionary information for a single table.
loadSchema.aspxNo visible elements. Accepts requests from default.aspx and generates the required output.

dataDic.aspx

This page takes the XML description for one table and using LINQ for XML creates aspx text controls for each extended property associated with the table. As written it expects to create controls and retrieve information for only the attributes given above. It should be relatively straight-forward to modify it to cope with a configurable list of extended properties.

The ASP generation code is very simple...

C#
var query = from column in info.Elements("table").Elements("tableProps").Elements("column")
            select column;

Panel ColumnPanel = new Panel();
ColumnPanel.ScrollBars = ScrollBars.Vertical;
ColumnPanel.Height = Unit.Pixel(400);

this.Panel1.Controls.Add(new LiteralControl(@"<hr />"));
this.Panel1.Controls.Add(ColumnPanel);

foreach(XElement column in query)
{
  // each column in its own 2 column div
  // Column name to the left, input field to the right.
  ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""columnInfo"">"));

  // Get the name of the column and create a label using it.
  string columnName = column.AttributeString("Name", "!unnamed!");
  string columnComment = "";

  ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""cellLeft"">"));
  ColumnPanel.Controls.Add(new LiteralControl(columnName));
  ColumnPanel.Controls.Add(new LiteralControl(@"</div>"));

  // Get the comment / description for the column.
  var qryText = from comment in column.Elements("columnProps")
                select comment;

  // Despite the loop there'll only be one comment.
  foreach(var comment in qryText)
  {
    columnComment = comment.AttributeString("Comment", "");
  }

  // Create a multiline input text area with a default text given by
  // any existing comment attribute.
  ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""cellRight"">"));
  TextBox description = new TextBox();
  description.ID = columnName;
  description.Text = columnComment;
  description.Width = Unit.Pixel(500);
  description.Height = Unit.Pixel(40);
  description.TextMode = TextBoxMode.MultiLine;
  description.ToolTip = string.Format("Describe the purpose of [{0}] and any special rules that may apply to it.", columnName);
  description.MaxLength = (int)ExtendedProperty.SizeOf.Value;
  description.Attributes.Add("runat", "server");
  ColumnPanel.Controls.Add(description);
  ColumnPanel.Controls.Add(new LiteralControl("</div>"));

  this.Panel1.Controls.Add(new LiteralControl("</div>"));

}

Using the Utility

homePage.png

Pick your database, the table(s) you're interested in and the output format.

Table NameA table name or a SQL pattern such as %customer%. Leave blank to get all tables and views in the selected database.
FormatSelects the transform that will be used to generate the output. With the exception of sp_help all formats will prompt you to download the output. The sp_help format takes you to a page like that shown below .
Show Hidden TablesDisplay tables marked by the utility as "hidden". Information for tables so marked will not normally be retrieved when requesting schemata. This can be useful if there are sensitive tables details of which should not be included in any documentation being created for third parties.

spHelp.png

The "sp_help" page

updateSchema.png

The update table information page

Sample C# Output

C#
using System.Reflection;

namespace database.lbs
{

 /// <summary>
 /// Column list for table.
 /// </summary>
 public class Address
 {
   /// <summary>
   /// Return the table's name. Address
   /// </summary>
   /// <returns />
   public static string _Self()
   {
     return "Address";
   }

   /// <summary>
   /// Return database size of named column.
   /// </summary>
   /// <returns />
   public static int _SizeOf(string columnName)
   {
     Address instance = new Address();

     FieldInfo f = instance.GetType().GetField(string.Format("siz_{0}", columnName));
     return Convert.ToInt32(f.GetValue(instance));
   }

   /// <summary>
   /// Return SQL database type of named column.
   /// </summary>
   /// <returns />
   public static string _TypeOf(string columnName)
   {
     Address instance = new Address();

     FieldInfo f = instance.GetType().GetField(string.Format("typ_{0}", columnName));
     return Convert.ToInt32(f.GetValue(instance));
   }

   public const string ID = "ID";
   private const string typ_ID = "uniqueidentifier";
   private const int siz_ID = 16;
   public const string PostTown = "PostTown";
   private const string typ_PostTown = "varchar";
   private const int siz_PostTown = 20;
   public const string Postcode = "Postcode";
   private const string typ_Postcode = "varchar";
   private const int siz_Postcode = 15;
   public const string Building = "Building";
   private const string typ_Building = "varchar";
   private const int siz_Building = 30;
   public const string Block = "Block";
   private const string typ_Block = "varchar";
   private const int siz_Block = 150;
 }

}

Examples of Intended Use

C#
// Extracting data from ADO.Net data structures
string postcode = myDataViewRow(Address.Postcode).ToString();

// Restricting input length for string data.
txtPostcode.MaxLength = Address._Sizeof(Address.Postcode);

Limitations and Issues to Consider

Database Access

In small organisations or for those of us tinkering at home this isn't likely to be a problem. However as given here the utility assumes that the database is configured to accept SQL as well as Windows logins.

Extended Properties

Table and Column extended properties require "alter table" privileges on each table being updated. As with database access above this is unlikely to be an issue in small organisations, but could be problematic in larger organisations with a tightly controlled DBA / Developer split.

Changes to Views

We have to bear in mind that MS may change the names, structure and content of any of the views used to retrieve the information as MS-SQL is updated.

Known Limitations

  • The utility uses Session for moving data between pages. So watch out for session timeout and other session related vulnerabilities.
  • There is no validation of user input. We assume (I know, I know) this is for use by sensible types.

Points of Interest

Non-breaking spaces are used in HTML output to ensure that empty table columns are shown. They may not be necessary in the document format but have been retained, just to be certain.

For reasons I have yet to work out Open Office Writer (3.2.0) treats HTML documents with an HTML doctype tag as plain text if the encoding of the transformed and downloaded output is Unicode. Setting the encoding to UTF-8 "cures" the problem.

Conclusion

It is surprising how effective even something as crude as the utility described here can be. The original version of this tool has proved very effective in the following ways:

  • Significant reduction in the effort required to get to grips with unfamiliar or infrequently modified parts of a database because the purpose and use of columns in tables is explicit and no longer has to be inferred from the column's name and the data held in the table
  • Prevention of runtime errors through use of auto-generated constants instead of string literals when accessing data retrieved using ADO.Net
  • Significant reduction in the time taken to create design and customer documentation.

As given here the utility is suitable for use by small teams responsible for all aspects of application development including DB setup and administration. However the background information and sample code should provide a good starting point for further development should it not be suitable as is.

History

DateRemarks
Dec 2005Original for SQL-7 & SQL-2000
Jan 2012Rewrite.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)