Rapid development and refactoring using code generators - Part 2

Chris Cole

3.14/5 (7 votes)

2 Aug 200512 min read

273

Using CodeDOM and SQL Server to build entity and factory classes - Part 2: Building Entity and Factory classes from a database.

Introduction

Well designed object models increase reliability and code reuse. We all know that. But building and maintaining them can be a pain, especially in a changing environment. In this three-part series, we'll look at how rolling your own code generators can speed your development and take the fear and bite out of rapid refactoring.

Part 1 - Generating classes with CodeDom
Part 2 - Using code generators to build entity and factory classes from a database
Part 3 - Subclassing to allow rapid refactoring

In this, the second part in our three-part series, we'll be enhancing our simple code generator from the previous article to actually do something useful - automate the building of entity and factory classes. For this article and the subsequent one, we'll be using the NorthWind sample database on Microsoft SQL Server.

What's the Destination?

In the case of our tool, it's the destination, not the journey. We have to understand our end target and work backwards from that to create something that gets us there. In our case:

Factory and Entity class

we're making the assumption that we have a well designed data model having a direct correlation between many of the database tables and entities used in the application. In addition to building the entity code, we will also apply a factory pattern and generate the factory class.

We'll work from a simplified version of my normal entity-factory design both for brevity and so no one feels too compelled to start a debate on the finer points of class design (and so you still have a reason to pay for my consulting). I'll leave it to you, the reader, to fine tune things to your own liking. For now, suspend disbelief and let's work with my model. I am going to throw in one goodie lifted directly from Java EJB entity beans: primary key objects. In our case, we'll be working from a common IPrimayKey interface that I'll just provide directly below.

Simple Entity class

This is an abbreviated listing of a simple entity with the entity-specific primary key implementation excluded for brevity.

using System;

namespace ExampleNamespace
{
    /// <summary>
    /// Summary description for SimpleEntity.
    /// </summary>
    public class SimpleEntity
    {
        private int simpleID;
        private string name;
        private IPrimaryKey primaryKey;

        #region Entity Constructors
        public SimpleEntity()
        {
            this.Clear();
        }
        public SimpleEntity(int simpleID, string name) : this()
        {
            this.SimpleID = simpleID;
            this.Name = name;
        }
        #endregion

        #region Property values
        public int SimpleID
        {
            get { return this.simpleID; }
            set
            {
                this.simpleID = value;
                this.primaryKey = new SimplePrimaryKey(value);
            }
        }
        public string Name
        {
            get { return this.name; }
            set    { this.name = value; }
        }
        #endregion

        public void Clear()
        {
            primaryKey = new SimplePrimaryKey();
            this.name = null;
        }

        public bool IsNew()
        {
            return this.primaryKey.IsNew();
        }


        /// <summary>
        /// Entity specific primary key class
        /// </summary>
        class SimplePrimaryKey : IPrimaryKey ...
    }
}

Simple EntityFactory class

Below is a simple factory to build entities from. In a real-world application, you would use stored procedures, or at the very least, parameterized SqlCommand statements, but for ease of reading and to demonstrate the concepts, we'll just be using inline-built SQL statements.

using System;
using System.Collections;
using System.Data;


namespace ExampleNamespace
{
    /// <summary>
    /// Controls creation of SimpleEntity objects
    /// </summary>
    public class SimpleEntityFactory
    {
        public SimpleEntityFactory()
        {}

        /// <summary>
        /// Root column select statement
        /// </summary>
        protected string SimpleEntitySelect()
        {
            string sql;
            sql = "SELECT SimpleID, Name ";
            return sql;
        }

        /// <summary>
        /// Convert a DataRow into an object
        ///   by filling in the property values
        /// </summary>
        protected SimpleEntity EntityFromTableRow(DataRow row)
        {
            SimpleEntity obj;
            if(row != null)
            {
                obj = new SimpleEntity(Convert.ToInt32(row[0]), 
                                     Convert.ToString(row[1]));
            }
            else
            {
                obj = new SimpleEntity();
            }
            return obj;
        }

        /// <summary>
        /// Return a SimpleEntity by looking it up based on primary key
        /// </summary>
        public SimpleEntity FindByPrimaryKey(int key)
        {
            string sql = this.SimpleEntitySelect() + 
                " FROM Simple WHERE SimpleID='" + key + "';";
            DataTable table = DataAccess.ExecuteSqlSelect(sql);
            return this.EntityFromTableRow(table.Rows[0]);
        }

        /// <summary>
        /// Returns a collection of objects representing all current records
        /// </summary>
        public ICollection GatherAllSimple()
        {
            string sql = this.SimpleEntitySelect() + " FROM Simple ;";
            DataTable table = DataAccess.ExecuteSqlSelect(sql);
            ArrayList lst = new ArrayList();
            for(int i = 0; i < table.Rows.Count; i++)
            {
                lst.Add(this.EntityFromTableRow(table.Rows[i]));
            }
            return lst;
        }

        /// <summary>
        /// Save the entity to database by creating new or updating existing
        /// </summary>
        public SimpleEntity Save(SimpleEntity entity)
        {
            string sql;
            if(entity.IsNew())
            {
                sql = "INSERT INTO Simple(Name) VALUES('" + entity.Name + "');";
                DataAccess.ExecuteSqlNonQuery(sql);
                DataTable id = 
                   DataAccess.ExecuteSqlSelect("SELECT scope_identity();");
                entity.SimpleID = Convert.ToInt32(id.Rows[0][0]);
                return entity;
            }
            else
            {
                sql = "UPDATE Simple SET Name = '" + entity.Name + "';";
                DataAccess.ExecuteSqlNonQuery(sql);
                return entity;
            }
        }
    }
}

IPrimaryKey common interface

This interface is implemented by each entity object we build. The basic notion is derived from how Entity Java Beans (EJB) manage primary key representations. It allows us to support compound primary keys with a common interface and determine when an entity object is new and/or is currently in the database by only looking at the primary key.

using System;

namespace ExampleNamespace
{
    /// <summary>
    /// Primary key interface to be implemented by each entity object
    /// </summary>
    interface IPrimaryKey
    {
        /// <summary>
        /// Returns true if the primary key is uninitialized,
        /// false if the primary key has a valid value
        /// </summary>
        bool IsNew();

        /// <summary>
        /// Resets the primary key to an uninitialized state
        /// </summary>
        void Clear();
    }
}

Supporting Classes

Additionally, we need a class to facilitate data access (and potentially abstract base entities, factories, or interfaces). One simple data access example is included in the source code for this article. Some approaches include creating a base abstract factory to provide this functionality or having a separate known data access tier. Actual source code may be viewed in the associated Zip, but I'll not bore you with such mundaneness here.

The Journey

Reading the table structure

Now that we know where we're going, it's easier to see that path to get there. In our case, we're going to link to the database, read the table structure, and then build appropriate classes based on that structure. In the previous article, we built a foundation for the entity class and a simple Windows form to drive the class builder. We'll be extending that application to get to our destination.

Getting the data structure

Our first step is to load the user tables from our chosen database, and the fields from our chosen table. Below are the two SQL statements used. You can see them in context in the full source code.

string sqlForTables = "select [name] 'name' from" + 
   " sysobjects where xtype='U' and LEFT([name], 2) <> 'dt';";
string sqlForColumns = "select * from " + tablename + "where 1=0;";

After selecting the table structure into a DataTable named results, we iterate the DataTable.Columns collection and build a new table with column names and data types. This table will be used to construct our properties.

for(int i=0; i<results.Columns.Count; i++)
{
    DataColumn col = results.Columns[i];
    columnTable.Rows.Add
        (new object[]{col.ColumnName, col.DataType.ToString()});
}

And finally we need to find the primary key(s) for the table and place them in a HashTable which is then compared to the fields in our columnTable to mark the ones which are part of the primary key.

string sql = "sp_pkeys @table_name=" + tablename;

...

//Load the pkFields array
this.pkFields = new Hashtable();
for(int i=0; i<results.Rows.Count; i++)
{
    string colname = (string)results.Rows[i]["COLUMN_NAME"];
    pkFields.Add(colname, colname);
}

At this point we have the data structure we will be working from. We also trapped the table name and removed spaces to create the entityName value in our class builder and the entityNamePrimaryKey value for referencing our primary key class. Now we're ready to extend our tool to build some code around it.

Building the Entity Code

For the entity, we need to iterate the columns and use them as the basis for creation of:

Internal Fields
Properties
Constructor
PrimaryKey object
Clear method

In order to do this, we'll expand on our SimpleProperty method and add several new methods for creation of the other items outlined above. For each item, the code will iterate our columnTable and perform the desired action. You may choose to optimize this, but it's hardly necessary for this tool. The convention we'll use is columnName starting with lower case for internal fields, and ColumnName starting with upper case for external properties, regardless of capitalization within field names.

Properties

First of all, duplicate the SimpleProperty method and rename it to EntityProperty. Add an additional parameter isPrimary and change the type parameter to accept the type as a string. Final code will look as below:

CodeMemberProperty EntityProperty(string propName, 
          string internalName, string typeName, bool isPrimary)
{
    CodeMemberProperty prop = new CodeMemberProperty();
    prop.Name = propName;
    prop.Comments.Add(new 
         CodeCommentStatement("Property comment for " + propName));
    prop.Attributes = MemberAttributes.Public;
    prop.Type = new CodeTypeReference(typeName);
    prop.HasGet = true;
    prop.GetStatements.Add(
        new CodeMethodReturnStatement(
        new CodeFieldReferenceExpression(new 
            CodeThisReferenceExpression(), internalName)));

    prop.HasSet = true;
    prop.SetStatements.Add(
        new CodeAssignStatement(
        new CodeFieldReferenceExpression(new 
            CodeThisReferenceExpression(), internalName),
        new CodePropertySetValueReferenceExpression()));

    //if part of primary key, recreate the primaryKey object
    if(isPrimary)
    {
        CodeObjectCreateExpression pkCreate =
            new CodeObjectCreateExpression(this.primaryKeyClassname,
            new CodeExpression[]{new CodePropertySetValueReferenceExpression()});

        prop.SetStatements.Add(
            new CodeAssignStatement(
            new CodeFieldReferenceExpression(new 
                CodeThisReferenceExpression(), "primaryKey"),
            pkCreate));
    }

    return prop;
}

The main change is that we now have additional logic in the setter routine. If a field is a primary key the set routine will create a new primaryKey with the new value. The astute reader will notice that this design only supports single-field primary keys. I leave it to you, the reader, to work out the required logic for multi-value primary keys. I wouldn't spend too much time on it, though. In real-world transactional databases, compound primary keys are most often used in linking tables for many-to-many relationships. In an entity class structure, this will translate to either a bi-directional hierarchy or two separate hierarchies, allowing the data access layer to infer the composite primary key when communicating with the database.

Constructors

An additional constructor is added that allows the entire entity object to be instantiated and populated in a single call. This is critical when building collections of like objects from a SQL result set, since it translates into one call/row instead of 20 calls/row (or whatever).

/// Generates a CodeConstructor for creating and populating
/// an entity in one call
CodeConstructor AssignmentConstructor()
{
    CodeConstructor cconAssign = new CodeConstructor();
    cconAssign.Attributes = MemberAttributes.Public;
    cconAssign.Comments.AddRange(SummaryComments("Constructor" + 
                         " allowing population of properties"));
    for(int i=0; i < this.columnTable.Rows.Count; i++)
    {
        string fldName = this.ToInternalName((string)columnTable.Rows[i][0]);
        string fldType = (string)columnTable.Rows[i][1];

        //Constructor parameter
        cconAssign.Parameters.Add(new 
           CodeParameterDeclarationExpression(fldType, fldName));
        //Constructor value assignment statement
        cconAssign.Statements.Add(
            new CodeAssignStatement(
            new CodeFieldReferenceExpression(new 
                CodeThisReferenceExpression(), fldName),
            new CodeArgumentReferenceExpression(fldName)));
    }
    return cconAssign;
}

PrimaryKey object

Each entity will have an implementation of the IPrimaryKey interface. This allows the factory classes to have common interactions with the entity objects. In the design I actually use in production, my entity factories share a common abstract base class (as do the entities) that holds some common functions. In this simplified article, I have eliminated this, but the IPrimayKey interface gives a clue as to how some of this works in a real implementation.

Below is our method that makes an entity-specific implementation of our IPrimaryKey interface. It's a fairly simple class. The only methods, IPrimaryKey.IsNew and IPrimaryKey.Clear are fairly straightforward. If you look at the PKIsNew and PKClearMethod functions, the methods they generate are checking and resetting the internal primary key field to an "Uninitialized" state. This uninitialized value is picked from a set of defaults. In the case of Identity (integer) fields, the default is zero. Using a value of zero as the identity seed would obviously wreck havoc with this, so the uninitialized value would have to be adjusted accordingly. For character, string (varchar, nvarchar, etc.), and UniqueIdentifiers, this value is null.

CodeTypeDeclaration BuildPrimaryKeyClass()
{
    //Build the class declaration and member variables
    CodeTypeDeclaration pkclass = new CodeTypeDeclaration();
    pkclass.IsClass = true;
    pkclass.Name = this.primaryKeyClassname;
    pkclass.Attributes = MemberAttributes.FamilyAndAssembly;
    pkclass.BaseTypes.Add("IPrimaryKey");

    //Add comments for class
    pkclass.Comments.AddRange(
        this.SummaryComments("Implementation of primary key" + 
                                  " object for this entity"));

    //Add fields
    //Cycle columnTable and build property methods
    for(int i=0; i < this.columnTable.Rows.Count; i++)
    {
        if((string)columnTable.Rows[i][2]=="X")
        {
            string internalFldName = 
                   this.ToInternalName((string)columnTable.Rows[i][0]);
            string externalFldName = this.ToExternalName(internalFldName);
            string propType = (string)columnTable.Rows[i][1];
            pkclass.Members.Add(FieldVar(internalFldName, 
                            propType, MemberAttributes.Private));
            pkclass.Members.Add(this.SimpleProperty(externalFldName, 
                            internalFldName, propType));
        }
    }

    //Constructors
    pkclass.Members.Add(this.PrimaryKeyDefaultConstructor());
    pkclass.Members.Add(this.PrimaryKeyAssignmentConstructor());

    //Methods
    pkclass.Members.Add(this.PKIsNew());
    //Methods
    pkclass.Members.Add(this.PKClearMethod());

    return pkclass;
}

Once again this implementation doesn't actually support compound primary keys, but it does provide for a common, field neutral interface for interaction with the primary key.

Building the Factory Code

The factory code is actually much simpler than the entity code. We only have two public methods and it's mostly a matter of spitting out the appropriate SQL. For the factory, we iterate the columns and use them for creation of:

FindByPrimaryKey method (3 lines of code)
GatherAll method (8 lines of code)

Sounds pretty simple, doesn't it. In fairness, there are two supporting functions totaling and additional 12 lines of code inclusively. The basic attack pattern is the same as above: work out the target code and methods, include statically named sections directly, and iterate the columns to generate your field references (or SQL strings in this case).

The class factory build method BuildFactoryClass has much the same structure as the BuildClass method, handling code generator and file stream creation, as well as writing the code. Each target method is represented by a separate build method returning a CodeMemberMethod object. Originally, I had wanted these methods to be public static. Unfortunately, the CodeType.Attributes does not support multiple attributes through its MemberAttributes enumeration. Nor does it support collections of attributes. So I had to settle on public instance methods instead.

EntityFromTableRow

The first method we'll look at is the one that builds the method to convert a DataRow into an entity object. It's really the workhorse of the factory class. You might expect this to be fairly difficult to write, but it actually wasn't too bad. We'll just look at a couple parts of it. Below is where we iterate the columns and build them out as arguments to our entity constructor.

CodeObjectCreateExpression objcreate = new CodeObjectCreateExpression();
objcreate.CreateType = new CodeTypeReference(this.entityName);
for(int i=0; i < this.columnTable.Rows.Count; i++)
{
    string fldName = this.ToInternalName((string)columnTable.Rows[i][0]);
    string fldType = (string)columnTable.Rows[i][1];
    CodeMethodInvokeExpression param =
        new CodeMethodInvokeExpression(new CodeSnippetExpression("Convert"),
        this.ConversionMethodName(fldType),
        new CodeExpression[]{new CodeSnippetExpression("row[" 
                                     + i.ToString() + "]")});

    objcreate.Parameters.Add(param);
}

In the above code, you will notice a call to an internal method this.ConversionMethodName. As we supply the arguments, we need to strongly type them. Back in the old classic ASP/VB COM days, when I first started playing around with entity code generators, this was an unnecessary step. Everything was a Variant (generic object for you newcomers), so typed or not it just got slapped into a generic variable at the end of the day. In fact, if you did type the parameters to your COM objects, you'd have to explicitly cast all the arguments within the ASP code or you'd get errors from here to Kingdom COM(e) {snicker}. So now that we do have strong typing (a huge improvement, BTW) we have to convert everything to the proper type. Casting doesn't work because you can't cast a reference type (i.e. object) to a primitive (ex: int). One has to make use of the System.Convert object methods. The method naming conventions are standardized well, so we can make a pretty good guess from the column data type.

string ConversionMethodName(string typeName)
{
    string tname = typeName.Substring(typeName.LastIndexOf(".") + 1);
    return "To" + tname;
}

GatherAll

The only other method we'll look at is the one to generate our GatherAll factory method. To refresh, here is our target code:

public static ICollection GatherAll()
{
    string sql = this.SimpleEntitySelect() + " FROM Simple ;";
    DataTable table = DataAccess.ExecuteSqlSelect(sql);
    ArrayList lst = new ArrayList();
    for(int i = 0; i < table.Rows.Count; i++)
    {
        lst.Add(this.EntityFromTableRow(table.Rows[i]));
    }
    return lst;
}

So our target is eight lines of code (11 if you count the method declaration). We know from above that public static is impossible with CodeDOM, so we'll just make it public. I'll save the suspense and just give you the code, then we'll talk about it.

CodeMemberMethod GatherAllMethod()
{
    CodeMemberMethod gatherMethod = new CodeMemberMethod();
    gatherMethod.Name = "GatherAll";
    gatherMethod.ReturnType = new CodeTypeReference("ICollection");

    gatherMethod.Comments.AddRange(
        this.SummaryComments("Returns a collection" + 
                " of all entities in the database"));

    gatherMethod.Attributes = MemberAttributes.Public;

    //Add sql declaration
    //TARGET CODE: string sql = this.SimpleEntitySelect() + 
                " FROM Simple WHERE SimpleID='" + key + "';";

    gatherMethod.Statements.Add(
        new CodeVariableDeclarationStatement(typeof(string), "sql",
        new CodeMethodInvokeExpression(new 
            CodeThisReferenceExpression(),
            "BaseEntitySelect",
            new CodeExpression[]{})));

    gatherMethod.Statements.Add(
        new CodeSnippetExpression("sql += \" FROM " + 
            this.ToSqlName(this.tableName) + "\" "));

    //Below generates this statement - much easier 
    //just to write as a string, 'eh?
    //TARGET CODE: DataTable table = DataAccess.ExecuteSqlSelect(sql);
    gatherMethod.Statements.Add(
        new CodeVariableDeclarationStatement(
           typeof(System.Data.DataTable), "table",
           new CodeMethodInvokeExpression(new 
           CodeSnippetExpression("DataAccess"), "ExecuteSqlSelect",
           new CodeExpression[]{new 
           CodeVariableReferenceExpression("sql")})));

    //TARGET CODE: ArrayList lst = new ArrayList();
    gatherMethod.Statements.Add(
      new CodeVariableDeclarationStatement("System.Collections.ArrayList", 
      "lst", new CodeObjectCreateExpression("System.Collections.ArrayList",
      new CodeExpression[]{})));

    //Some items just can't be written with CodeDOM
    //TARGET CODE: for(int i = 0; i < table.Rows.Count; i++)
    //            {
    //                lst.Add(this.EntityFromTableRow(table.Rows[i]));
    //            }

    //This is the component code:
    //lst.Add(this.EntityFromTableRow(table.Rows[i]));
    CodeStatement[] iterationCode = new CodeStatement[]
        {
            new CodeExpressionStatement(
                new CodeMethodInvokeExpression(new 
                    CodeVariableReferenceExpression("lst"), "Add",
                    new CodeExpression[]{
                        new CodeMethodInvokeExpression(new 
                            CodeThisReferenceExpression(), 
                                    "EntityFromTableRow",
                            new CodeExpression[]
                            {
                                new CodeArrayIndexerExpression(new 
                                    CodeMethodReferenceExpression(
                                new CodeVariableReferenceExpression("table"), 
                                "Rows"), new CodePrimitiveExpression(0))
                            }
                        )})
            )
        };


    //And this is the loop constructor
    gatherMethod.Statements.Add(
        new CodeIterationStatement(
            new CodeVariableDeclarationStatement(typeof(int), 
                "i", new CodePrimitiveExpression(0)),
            new CodeBinaryOperatorExpression(new 
                CodeVariableReferenceExpression("i"), 
                CodeBinaryOperatorType.LessThan,
                new CodeSnippetExpression("table.Rows.Count")),
            new CodeAssignStatement(new CodeVariableReferenceExpression("i"),
                new CodeSnippetExpression("i+1")), iterationCode));

    //TARGET CODE: return lst;
    gatherMethod.Statements.Add(
        new CodeMethodReturnStatement(
        new CodeVariableReferenceExpression("lst")));

    return gatherMethod;
}

Soooooo, to generate 8 lines of code, CodeDOM requires... (throw out the white space, ignore the comments, carry the 1)... a whopping 47 lines of code!!! I counted 41 objects (count of objects by counting calls to "new"). And it couldn't even be done without a few uses of the CodeSnippetExpression class. To do the math, that's 500% bloat. Did I mention my template based solution didn't even look at this method since it was common enough to be part of the base class?

Conclusion

Building a code generator for one entity is a lot of work. A whole lot of work. But it's the same amount of work as building a code generator for twenty entities. Or two hundred. On the flip side, building one entity directly is a fair amount of work. Building two is a bit more. Building 20 is ridiculous, and building 200 would be ludicrous. My basic rule of thumb for anything repetitive is thus: If you're going to do it twice, do it manually. If you may do it three times, break it out or write a tool to automate it.

Final comments on CodeDOM

At this point the weaknesses of CodeDOM as a practical tool should be self-evident. By abstracting all calls and references, you get a language structure that, to put it kindly, is non-obvious. Target source code methods and patterns are difficult to maintain and make changes to when compared to template-based approaches where the code exists in the target format within the template. Complex methods or logic routines are all but impossible to write without just writing the code directly as a CodeSnippet.

Another major problem with CodeDOM and how it generates code is its inability to have multiple attributes. I can see where the designers might want to restrict this so that users don't add mutually exclusive attributes to a given type (class, variable, etc.), but come on. Who hasn't needed a few public static methods and variables now and then?

In exchange for these weaknesses, you have a powerful abstraction layer for the target language, but not powerful enough to completely divorce yourself from your language syntax (see all the CodeSnippet objects used for relatively common code sequences). And you're confined to languages with CodeDOM support: i.e., VB.NET and C#.

After completing this stage of the project, I can say with the utmost confidence that I will not be using CodeDOM again in the near future. For the amount of bloat created by CodeDOM, I could hand code my generator to support three or four languages with a series of case statements. All is not lost, however. There are some projects out there to wrap CodeDOM in a way to make it actually work somewhat as promised. The NGenerate project sounds promising and there are several XML/XSLT based projects floating around. You'll have to do the research yourself.

But now we have a code generation tool built and can move on to the next stage: Packaging and Subclassing for Rapid Refactoring. Additional "Fancy Features" to consider building on your own time might include integrated logging and performance counters, exception handling, partial field updates, and stored procedure autogeneration and support. If you want to get crazy, you could use sp_fkeys to build entities with child-entity collections already included. Or not.

Preview Part III

In Part III, we'll show how to package and subclass your generated code for rapid refactoring. We'll code a simple web application and demonstrate how database schema changes can have little or no effect on your human-maintained code.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here