Introduction
In versions of MS-SQL prior to 2005 there was no programmable way of allowing the addition of additional information such as descriptions to tables or columns. Attempts to address this problem usually resulted in solutions that either added one or more information tables to each database or held the information in a completely separate store. In either case it is too easy for the additional information to get out of step with the parent DB. The introduction in MS-SQL 2005 of extended properties for database objects and the provision of stored procedures to manipulate them goes a long way to solving this problem.
This article presents some very simple classes to extract, format and update data dictionary information using extended procedures for MS-SQL databases together with a bare bones web UI making use of them. It is based on the recent reworking of a utility originally I wrote a little over 6 years ago. The original was restricted by having to work with SQL-7 and suffered from some of the problems noted above.
The main requirements that drove the development of the original version were to provide:
- Provide a more readable summary of the information available from sp_help, preferably browser based
- Allow the maintenance of comments/descriptions for table columns, ditto.
As presented this utility allows for the maintenance of the following information:
- Table
- Hidden
- - Hides the table from the documentation utility.
- Version
- - The DB version that the table was introduced in
- Comment
- - A description of the table's purpose,
- Column
- Comment
- - A description of the column's purpose,
Background
SQL Server Extended Properties
Introduced with SQL2K5 extended properties allow you to associate sql_variant
values containing up to 7,500 bytes of data with objects in a MS-SQL server database.
Extended Properties have a number of advantages. Those that make them particularly suited to a data dictionary application are:
- They are part of the database and are backed up and restored with the database.
- Automatically dropped when the parent object (table, view, column) is dropped.
- Can be viewed using SQL Server Management Studio
Extended Properties are maintained using three stored procedures:
sys.sp_addextendedproperty
sys.sp_updateextendedproperty
sys.sp_deleteextendedproperty
Unfortunately these stored procedures are a little "unfriendly" in their behaviour and it is necessary to check for the existence of the property you wish to add, update or delete before calling them.
information_schema and other views
These views allow us to examine DB structure without having to resort to the sysobjects
table as was the case in earlier versions of MS SQL Server. For this piece we're interested in the following:
sys.extended_properties
- - Lists all defined extended properties in the current DB.
sys.information_schema.tables
- - Lists all tables and views in the current DB.
sys.information_schema.columns
- - Lists column information for views and tables in the current DB.
sys.information_schema.table_constraints
- - Lists constraints, such as primary keys, for columns and tables.
sys.views
- - Lists views in the DB. Allows us to distinguish views from tables in
information_schema.tables
and to retrieve object IDs for table, view and column names. sys.tables
- - Lists tables in the DB. Serves the same purpose as
sys.views
.
XML in T-SQL
This is old hat now, but it allows us to generate a variety of output formats using some simple XSLT.
Using the Code
Class: Query
This has a number of static methods returning T-SQL scripts. The T-SQL is embedded in the application because we don't want to have to add anything to a DB in order for any application that we may write to run.
Method | Scope | Returns |
GetSchemata | public static | string The T-SQL necessary to get schemata for one or more tables. |
GetSingleTable | public static | string The T-SQL necessary to get a schema for a single table. Little more than a call forwarder to GetSchemata |
ColumnList | public static | string The T-SQL necessary to get a list of columns for a single table. |
UpdateColumnProperty | public static | string The T-SQL necessary to add or update an extended property for a table column. |
UpdateTableProperty | public static | string The T-SQL necessary to add or update an extended property for a table or a view. |
The scripts represented by these methods, especially GetSchemata
, determine the extended properties that we create and update.
GetSchemata
For both the table and column properties we extract the extended properties from the sys.extended_properties
view using a couple of derived tables. Should you want to maintain more extended properties then you'd start your modifications here. Were you so minded the number and name of extended properties available could be made configurable.
Table Properties
...
left join
(select
major_id,
max(case when [name]='comment' then [value] else '' end) as comment,
max(case when [name]='hidden' then [value] else '0' end) as hidden,
max(case when [name]='version' then [value] else '' end) as version
from sys.extended_properties
where class = 1 and minor_id = 0
group by major_id
) as [tableProps]
...
Column Properties
...
left join
(select
major_id, minor_id,
max(case when [name]='comment' then [value] else '' end) as comment,
max(case when [name]='version' then [value] else '' end) as version
from sys.extended_properties
where class = 1
group by major_id, minor_id
) as [columnProps]
...
It's worth noting that that columns are identified by the same major_id
as their parent table and a minor_id
. The minor_id is simply the column's index.
The output from GetSchemata
is one or more rows of XML formatted fragments. These fragments are concatenated to give an XML fragment like that shown below. This allows us to use XSL to generate a number of different output formats.
<Tables Name="lbs">
<table Name="Bike" IsView="0">
<tableProps Comment="Description of a specific bike linking to owner and manufacturer."
Hidden="0"
Version="1.0.0">
<column Name="ID" PrimaryKey="1" Type="uniqueidentifier" Length="16" AllowNull="0" DefaultValue="newid()">
<columnProps Comment="Internal Primary Key GUID"
Version=""/>
</column>
<column Name="CustomerID" PrimaryKey="0" Type="uniqueidentifier" Length="16" AllowNull="1">
<columnProps Comment="Owner ID. If bike not stock"
Version=""/>
</column>
:
:
:
:
</tableProps>
</table>
:
:
</Tables>
UpdateColumnProperty, UpdateTableProperty
As far as I am aware there is no bulk update method for extended properties. A stored procedure call is required for each addition, removal or update. As a Data Dictionary utility is not likely to generate huge numbers of updates, assuming most tables rarely have more than a dozen or so columns and that a user can only update a single table at a time, this is unlikely to cause serious performance problems.
The T-SQL returned by these methods is straight-forward if ugly. A select to determine whether or not the property is already associated with the table or column directing a call to the add or update stored procedure as required
You will notice that the schema that owns the table is hard coded as 'dbo'. This is appropriate for the databases I tinker with at home. If you adopt this code then you may want to recover the schema name from the sys.schemas
view.
Obtaining a table's schema name
select
sys.schemas.name as [schemaName]
from sys.schemas
inner join (select [name], schema_id from sys.tables
union select [name], schema_id from sys.views)
as systable
on systable.schema_id = sys.schemas.schema_id
and systable.name = 'Bike'
Class: XML
This simply wraps up the calls necessary to apply an XSL transform to an XML string.
Method | Scope | Returns | Comment |
ApplyTransform | public static | string | Returns a string containing the transform output. |
ApplyTransform | public static | void | Writes the transform output to a file. |
XSLT Templates
Five templates are included by way of demonstration.
Filename | Output | Comment |
ToHTML.xsl | Generates a simple HTML page giving an sp_help like description of the table. | Provides a hyperlink to an ASPX page where the user can update the table's extended properties. |
ToDOC.xsl | A "document" format for use with Open Office Writer or MS Word. | This is a cheat. The output is a simplified variation on the ToHTML template with a doctype tag which allows Writer or Word to open the "document" as formatted text. It saves a lot of hard work creating RTF and, as both Writer and MS Word allow documents to be saved in a variety of formats including PDF, it deals neatly with the problem of supplying read only copies of database documentation to third parties. |
ToCSharp.xsl | A C# class providing access to the table name, its column names, the datalength and SQL data type of the each column. | Less useful than in Framework 1.1 now that database object models are so easily generated, but still handy for working with DataTables and DataViews . |
ToVB.xsl | A VB equivalent of ToCSharp.xsl | As C# output. |
ToSQL.xsl | A T-SQL script to add all defined extended properties to tables. | Intended for bulk setting up of data on creation of a new version of a database where no extended properties have yet been set. |
Each template follows the same extremely simple pattern:
<xsl:template match="/">
<xsl:apply-templates select="Tables/table">
<xsl:sort select="Name"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="table">
<xsl:apply-templates select="tableProps/column"/>
</xsl:template>
<xsl:template match="tableProps/column">
</xsl:template>
</xsl:stylesheet>
LINQ for XML - XElement Extension Class
The ASPX page that allows the user to update the table's extended properties is assembled on the fly using the output from Query::GetSingleTable
and LINQ for XML. This proved quite straight-forward apart from a minor irritation; the XElement::Attribute
method throws a null exception if there is no value for the named attribute. To keep the code easy(ish) to follow some XElement
extension methods were found to be necessary.
Method | Comment |
AttributeString | Returns a string value for the named attribute. Empty string if no alternative default value supplied. |
AttributeBool | Returns a boolean value for the named attribute. False if no alternative default value supplied. |
AttributeValue | Returns an object for the named attribute. A default must be supplied when the call is made. |
UI Notes
The UI is very simple. It has only 4 main components
Filename | Comment |
default.aspx | A login page. |
sp_help | Displays schemata for all selected tables. There is no aspx or other file associated with this page. It is created on request. |
dataDic.aspx | Allows the update of data dictionary information for a single table. |
loadSchema.aspx | No visible elements. Accepts requests from default.aspx and generates the required output. |
dataDic.aspx
This page takes the XML description for one table and using LINQ for XML creates aspx text controls for each extended property associated with the table. As written it expects to create controls and retrieve information for only the attributes given above. It should be relatively straight-forward to modify it to cope with a configurable list of extended properties.
The ASP generation code is very simple...
var query = from column in info.Elements("table").Elements("tableProps").Elements("column")
select column;
Panel ColumnPanel = new Panel();
ColumnPanel.ScrollBars = ScrollBars.Vertical;
ColumnPanel.Height = Unit.Pixel(400);
this.Panel1.Controls.Add(new LiteralControl(@"<hr />"));
this.Panel1.Controls.Add(ColumnPanel);
foreach(XElement column in query)
{
ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""columnInfo"">"));
string columnName = column.AttributeString("Name", "!unnamed!");
string columnComment = "";
ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""cellLeft"">"));
ColumnPanel.Controls.Add(new LiteralControl(columnName));
ColumnPanel.Controls.Add(new LiteralControl(@"</div>"));
var qryText = from comment in column.Elements("columnProps")
select comment;
foreach(var comment in qryText)
{
columnComment = comment.AttributeString("Comment", "");
}
ColumnPanel.Controls.Add(new LiteralControl(@"<div class=""cellRight"">"));
TextBox description = new TextBox();
description.ID = columnName;
description.Text = columnComment;
description.Width = Unit.Pixel(500);
description.Height = Unit.Pixel(40);
description.TextMode = TextBoxMode.MultiLine;
description.ToolTip = string.Format("Describe the purpose of [{0}] and any special rules that may apply to it.", columnName);
description.MaxLength = (int)ExtendedProperty.SizeOf.Value;
description.Attributes.Add("runat", "server");
ColumnPanel.Controls.Add(description);
ColumnPanel.Controls.Add(new LiteralControl("</div>"));
this.Panel1.Controls.Add(new LiteralControl("</div>"));
}
Using the Utility
Pick your database, the table(s) you're interested in and the output format.
Table Name | A table name or a SQL pattern such as %customer%. Leave blank to get all tables and views in the selected database. |
Format | Selects the transform that will be used to generate the output. With the exception of sp_help all formats will prompt you to download the output. The sp_help format takes you to a page like that shown below .
|
Show Hidden Tables | Display tables marked by the utility as "hidden". Information for tables so marked will not normally be retrieved when requesting schemata. This can be useful if there are sensitive tables details of which should not be included in any documentation being created for third parties. |
The "sp_help" page
The update table information page
Sample C# Output
using System.Reflection;
namespace database.lbs
{
public class Address
{
public static string _Self()
{
return "Address";
}
public static int _SizeOf(string columnName)
{
Address instance = new Address();
FieldInfo f = instance.GetType().GetField(string.Format("siz_{0}", columnName));
return Convert.ToInt32(f.GetValue(instance));
}
public static string _TypeOf(string columnName)
{
Address instance = new Address();
FieldInfo f = instance.GetType().GetField(string.Format("typ_{0}", columnName));
return Convert.ToInt32(f.GetValue(instance));
}
public const string ID = "ID";
private const string typ_ID = "uniqueidentifier";
private const int siz_ID = 16;
public const string PostTown = "PostTown";
private const string typ_PostTown = "varchar";
private const int siz_PostTown = 20;
public const string Postcode = "Postcode";
private const string typ_Postcode = "varchar";
private const int siz_Postcode = 15;
public const string Building = "Building";
private const string typ_Building = "varchar";
private const int siz_Building = 30;
public const string Block = "Block";
private const string typ_Block = "varchar";
private const int siz_Block = 150;
}
}
Examples of Intended Use
string postcode = myDataViewRow(Address.Postcode).ToString();
txtPostcode.MaxLength = Address._Sizeof(Address.Postcode);
Limitations and Issues to Consider
Database Access
In small organisations or for those of us tinkering at home this isn't likely to be a problem. However as given here the utility assumes that the database is configured to accept SQL as well as Windows logins.
Extended Properties
Table and Column extended properties require "alter table" privileges on each table being updated. As with database access above this is unlikely to be an issue in small organisations, but could be problematic in larger organisations with a tightly controlled DBA / Developer split.
Changes to Views
We have to bear in mind that MS may change the names, structure and content of any of the views used to retrieve the information as MS-SQL is updated.
Known Limitations
- The utility uses Session for moving data between pages. So watch out for session timeout and other session related vulnerabilities.
- There is no validation of user input. We assume (I know, I know) this is for use by sensible types.
Points of Interest
Non-breaking spaces are used in HTML output to ensure that empty table columns are shown. They may not be necessary in the document format but have been retained, just to be certain.
For reasons I have yet to work out Open Office Writer (3.2.0) treats HTML documents with an HTML doctype tag as plain text if the encoding of the transformed and downloaded output is Unicode. Setting the encoding to UTF-8 "cures" the problem.
Conclusion
It is surprising how effective even something as crude as the utility described here can be. The original version of this tool has proved very effective in the following ways:
- Significant reduction in the effort required to get to grips with unfamiliar or infrequently modified parts of a database because the purpose and use of columns in tables is explicit and no longer has to be inferred from the column's name and the data held in the table
- Prevention of runtime errors through use of auto-generated constants instead of string literals when accessing data retrieved using ADO.Net
- Significant reduction in the time taken to create design and customer documentation.
As given here the utility is suitable for use by small teams responsible for all aspects of application development including DB setup and administration. However the background information and sample code should provide a good starting point for further development should it not be suitable as is.
History
Date | Remarks |
Dec 2005 | Original for SQL-7 & SQL-2000 |
Jan 2012 | Rewrite. |