Introduction
Well designed object models increase reliability and code reuse. We all know that. But building and maintaining them can be a pain, especially in a changing environment. In this three-part series, we'll look at how rolling your own code generators can speed your development and take the fear and bite out of rapid refactoring.
In this, the second part in our three-part series, we'll be enhancing our simple code generator from the previous article to actually do something useful - automate the building of entity and factory classes. For this article and the subsequent one, we'll be using the NorthWind sample database on Microsoft SQL Server.
What's the Destination?
In the case of our tool, it's the destination, not the journey. We have to understand our end target and work backwards from that to create something that gets us there. In our case:
we're making the assumption that we have a well designed data model having a direct correlation between many of the database tables and entities used in the application. In addition to building the entity code, we will also apply a factory pattern and generate the factory class.
We'll work from a simplified version of my normal entity-factory design both for brevity and so no one feels too compelled to start a debate on the finer points of class design (and so you still have a reason to pay for my consulting). I'll leave it to you, the reader, to fine tune things to your own liking. For now, suspend disbelief and let's work with my model. I am going to throw in one goodie lifted directly from Java EJB entity beans: primary key objects. In our case, we'll be working from a common IPrimayKey
interface that I'll just provide directly below.
Simple Entity class
This is an abbreviated listing of a simple entity with the entity-specific primary key implementation excluded for brevity.
using System;
namespace ExampleNamespace
{
public class SimpleEntity
{
private int simpleID;
private string name;
private IPrimaryKey primaryKey;
#region Entity Constructors
public SimpleEntity()
{
this.Clear();
}
public SimpleEntity(int simpleID, string name) : this()
{
this.SimpleID = simpleID;
this.Name = name;
}
#endregion
#region Property values
public int SimpleID
{
get { return this.simpleID; }
set
{
this.simpleID = value;
this.primaryKey = new SimplePrimaryKey(value);
}
}
public string Name
{
get { return this.name; }
set { this.name = value; }
}
#endregion
public void Clear()
{
primaryKey = new SimplePrimaryKey();
this.name = null;
}
public bool IsNew()
{
return this.primaryKey.IsNew();
}
class SimplePrimaryKey : IPrimaryKey ...
}
}
Simple EntityFactory class
Below is a simple factory to build entities from. In a real-world application, you would use stored procedures, or at the very least, parameterized SqlCommand
statements, but for ease of reading and to demonstrate the concepts, we'll just be using inline-built SQL statements.
using System;
using System.Collections;
using System.Data;
namespace ExampleNamespace
{
public class SimpleEntityFactory
{
public SimpleEntityFactory()
{}
protected string SimpleEntitySelect()
{
string sql;
sql = "SELECT SimpleID, Name ";
return sql;
}
protected SimpleEntity EntityFromTableRow(DataRow row)
{
SimpleEntity obj;
if(row != null)
{
obj = new SimpleEntity(Convert.ToInt32(row[0]),
Convert.ToString(row[1]));
}
else
{
obj = new SimpleEntity();
}
return obj;
}
public SimpleEntity FindByPrimaryKey(int key)
{
string sql = this.SimpleEntitySelect() +
" FROM Simple WHERE SimpleID='" + key + "';";
DataTable table = DataAccess.ExecuteSqlSelect(sql);
return this.EntityFromTableRow(table.Rows[0]);
}
public ICollection GatherAllSimple()
{
string sql = this.SimpleEntitySelect() + " FROM Simple ;";
DataTable table = DataAccess.ExecuteSqlSelect(sql);
ArrayList lst = new ArrayList();
for(int i = 0; i < table.Rows.Count; i++)
{
lst.Add(this.EntityFromTableRow(table.Rows[i]));
}
return lst;
}
public SimpleEntity Save(SimpleEntity entity)
{
string sql;
if(entity.IsNew())
{
sql = "INSERT INTO Simple(Name) VALUES('" + entity.Name + "');";
DataAccess.ExecuteSqlNonQuery(sql);
DataTable id =
DataAccess.ExecuteSqlSelect("SELECT scope_identity();");
entity.SimpleID = Convert.ToInt32(id.Rows[0][0]);
return entity;
}
else
{
sql = "UPDATE Simple SET Name = '" + entity.Name + "';";
DataAccess.ExecuteSqlNonQuery(sql);
return entity;
}
}
}
}
IPrimaryKey common interface
This interface is implemented by each entity object we build. The basic notion is derived from how Entity Java Beans (EJB) manage primary key representations. It allows us to support compound primary keys with a common interface and determine when an entity object is new and/or is currently in the database by only looking at the primary key.
using System;
namespace ExampleNamespace
{
interface IPrimaryKey
{
bool IsNew();
void Clear();
}
}
Supporting Classes
Additionally, we need a class to facilitate data access (and potentially abstract base entities, factories, or interfaces). One simple data access example is included in the source code for this article. Some approaches include creating a base abstract factory to provide this functionality or having a separate known data access tier. Actual source code may be viewed in the associated Zip, but I'll not bore you with such mundaneness here.
The Journey
Now that we know where we're going, it's easier to see that path to get there. In our case, we're going to link to the database, read the table structure, and then build appropriate classes based on that structure. In the previous article, we built a foundation for the entity class and a simple Windows form to drive the class builder. We'll be extending that application to get to our destination.
Getting the data structure
Our first step is to load the user tables from our chosen database, and the fields from our chosen table. Below are the two SQL statements used. You can see them in context in the full source code.
string sqlForTables = "select [name] 'name' from" +
" sysobjects where xtype='U' and LEFT([name], 2) <> 'dt';";
string sqlForColumns = "select * from " + tablename + "where 1=0;";
After selecting the table structure into a DataTable
named results
, we iterate the DataTable.Columns
collection and build a new table with column names and data types. This table will be used to construct our properties.
for(int i=0; i<results.Columns.Count; i++)
{
DataColumn col = results.Columns[i];
columnTable.Rows.Add
(new object[]{col.ColumnName, col.DataType.ToString()});
}
And finally we need to find the primary key(s) for the table and place them in a HashTable
which is then compared to the fields in our columnTable
to mark the ones which are part of the primary key.
string sql = "sp_pkeys @table_name=" + tablename;
...
this.pkFields = new Hashtable();
for(int i=0; i<results.Rows.Count; i++)
{
string colname = (string)results.Rows[i]["COLUMN_NAME"];
pkFields.Add(colname, colname);
}
At this point we have the data structure we will be working from. We also trapped the table name and removed spaces to create the entityName
value in our class builder and the entityNamePrimaryKey
value for referencing our primary key class. Now we're ready to extend our tool to build some code around it.
Building the Entity Code
For the entity, we need to iterate the columns and use them as the basis for creation of:
- Internal Fields
- Properties
- Constructor
- PrimaryKey object
- Clear method
In order to do this, we'll expand on our SimpleProperty
method and add several new methods for creation of the other items outlined above. For each item, the code will iterate our columnTable
and perform the desired action. You may choose to optimize this, but it's hardly necessary for this tool. The convention we'll use is columnName
starting with lower case for internal fields, and ColumnName
starting with upper case for external properties, regardless of capitalization within field names.
Properties
First of all, duplicate the SimpleProperty
method and rename it to EntityProperty
. Add an additional parameter isPrimary
and change the type parameter to accept the type as a string. Final code will look as below:
CodeMemberProperty EntityProperty(string propName,
string internalName, string typeName, bool isPrimary)
{
CodeMemberProperty prop = new CodeMemberProperty();
prop.Name = propName;
prop.Comments.Add(new
CodeCommentStatement("Property comment for " + propName));
prop.Attributes = MemberAttributes.Public;
prop.Type = new CodeTypeReference(typeName);
prop.HasGet = true;
prop.GetStatements.Add(
new CodeMethodReturnStatement(
new CodeFieldReferenceExpression(new
CodeThisReferenceExpression(), internalName)));
prop.HasSet = true;
prop.SetStatements.Add(
new CodeAssignStatement(
new CodeFieldReferenceExpression(new
CodeThisReferenceExpression(), internalName),
new CodePropertySetValueReferenceExpression()));
if(isPrimary)
{
CodeObjectCreateExpression pkCreate =
new CodeObjectCreateExpression(this.primaryKeyClassname,
new CodeExpression[]{new CodePropertySetValueReferenceExpression()});
prop.SetStatements.Add(
new CodeAssignStatement(
new CodeFieldReferenceExpression(new
CodeThisReferenceExpression(), "primaryKey"),
pkCreate));
}
return prop;
}
The main change is that we now have additional logic in the setter routine. If a field is a primary key the set
routine will create a new primaryKey
with the new value
. The astute reader will notice that this design only supports single-field primary keys. I leave it to you, the reader, to work out the required logic for multi-value primary keys. I wouldn't spend too much time on it, though. In real-world transactional databases, compound primary keys are most often used in linking tables for many-to-many relationships. In an entity class structure, this will translate to either a bi-directional hierarchy or two separate hierarchies, allowing the data access layer to infer the composite primary key when communicating with the database.
Constructors
An additional constructor is added that allows the entire entity object to be instantiated and populated in a single call. This is critical when building collections of like objects from a SQL result set, since it translates into one call/row instead of 20 calls/row (or whatever).
CodeConstructor AssignmentConstructor()
{
CodeConstructor cconAssign = new CodeConstructor();
cconAssign.Attributes = MemberAttributes.Public;
cconAssign.Comments.AddRange(SummaryComments("Constructor" +
" allowing population of properties"));
for(int i=0; i < this.columnTable.Rows.Count; i++)
{
string fldName = this.ToInternalName((string)columnTable.Rows[i][0]);
string fldType = (string)columnTable.Rows[i][1];
cconAssign.Parameters.Add(new
CodeParameterDeclarationExpression(fldType, fldName));
cconAssign.Statements.Add(
new CodeAssignStatement(
new CodeFieldReferenceExpression(new
CodeThisReferenceExpression(), fldName),
new CodeArgumentReferenceExpression(fldName)));
}
return cconAssign;
}
PrimaryKey object
Each entity will have an implementation of the IPrimaryKey
interface. This allows the factory classes to have common interactions with the entity objects. In the design I actually use in production, my entity factories share a common abstract base class (as do the entities) that holds some common functions. In this simplified article, I have eliminated this, but the IPrimayKey
interface gives a clue as to how some of this works in a real implementation.
Below is our method that makes an entity-specific implementation of our IPrimaryKey
interface. It's a fairly simple class. The only methods, IPrimaryKey.IsNew
and IPrimaryKey.Clear
are fairly straightforward. If you look at the PKIsNew
and PKClearMethod
functions, the methods they generate are checking and resetting the internal primary key field to an "Uninitialized" state. This uninitialized value is picked from a set of defaults. In the case of Identity (integer) fields, the default is zero. Using a value of zero as the identity seed would obviously wreck havoc with this, so the uninitialized value would have to be adjusted accordingly. For character, string (varchar
, nvarchar
, etc.), and UniqueIdentifiers, this value is null.
CodeTypeDeclaration BuildPrimaryKeyClass()
{
CodeTypeDeclaration pkclass = new CodeTypeDeclaration();
pkclass.IsClass = true;
pkclass.Name = this.primaryKeyClassname;
pkclass.Attributes = MemberAttributes.FamilyAndAssembly;
pkclass.BaseTypes.Add("IPrimaryKey");
pkclass.Comments.AddRange(
this.SummaryComments("Implementation of primary key" +
" object for this entity"));
for(int i=0; i < this.columnTable.Rows.Count; i++)
{
if((string)columnTable.Rows[i][2]=="X")
{
string internalFldName =
this.ToInternalName((string)columnTable.Rows[i][0]);
string externalFldName = this.ToExternalName(internalFldName);
string propType = (string)columnTable.Rows[i][1];
pkclass.Members.Add(FieldVar(internalFldName,
propType, MemberAttributes.Private));
pkclass.Members.Add(this.SimpleProperty(externalFldName,
internalFldName, propType));
}
}
pkclass.Members.Add(this.PrimaryKeyDefaultConstructor());
pkclass.Members.Add(this.PrimaryKeyAssignmentConstructor());
pkclass.Members.Add(this.PKIsNew());
pkclass.Members.Add(this.PKClearMethod());
return pkclass;
}
Once again this implementation doesn't actually support compound primary keys, but it does provide for a common, field neutral interface for interaction with the primary key.
Building the Factory Code
The factory code is actually much simpler than the entity code. We only have two public methods and it's mostly a matter of spitting out the appropriate SQL. For the factory, we iterate the columns and use them for creation of:
FindByPrimaryKey
method (3 lines of code)
GatherAll
method (8 lines of code)
Sounds pretty simple, doesn't it. In fairness, there are two supporting functions totaling and additional 12 lines of code inclusively. The basic attack pattern is the same as above: work out the target code and methods, include statically named sections directly, and iterate the columns to generate your field references (or SQL strings in this case).
The class factory build method BuildFactoryClass
has much the same structure as the BuildClass
method, handling code generator and file stream creation, as well as writing the code. Each target method is represented by a separate build method returning a CodeMemberMethod
object. Originally, I had wanted these methods to be public static
. Unfortunately, the CodeType.Attributes
does not support multiple attributes through its MemberAttributes
enumeration. Nor does it support collections of attributes. So I had to settle on public
instance methods instead.
EntityFromTableRow
The first method we'll look at is the one that builds the method to convert a DataRow
into an entity object. It's really the workhorse of the factory class. You might expect this to be fairly difficult to write, but it actually wasn't too bad. We'll just look at a couple parts of it. Below is where we iterate the columns and build them out as arguments to our entity constructor.
CodeObjectCreateExpression objcreate = new CodeObjectCreateExpression();
objcreate.CreateType = new CodeTypeReference(this.entityName);
for(int i=0; i < this.columnTable.Rows.Count; i++)
{
string fldName = this.ToInternalName((string)columnTable.Rows[i][0]);
string fldType = (string)columnTable.Rows[i][1];
CodeMethodInvokeExpression param =
new CodeMethodInvokeExpression(new CodeSnippetExpression("Convert"),
this.ConversionMethodName(fldType),
new CodeExpression[]{new CodeSnippetExpression("row["
+ i.ToString() + "]")});
objcreate.Parameters.Add(param);
}
In the above code, you will notice a call to an internal method this.ConversionMethodName
. As we supply the arguments, we need to strongly type them. Back in the old classic ASP/VB COM days, when I first started playing around with entity code generators, this was an unnecessary step. Everything was a Variant
(generic object for you newcomers), so typed or not it just got slapped into a generic variable at the end of the day. In fact, if you did type the parameters to your COM objects, you'd have to explicitly cast all the arguments within the ASP code or you'd get errors from here to Kingdom COM(e) {snicker}. So now that we do have strong typing (a huge improvement, BTW) we have to convert everything to the proper type. Casting doesn't work because you can't cast a reference type (i.e. object
) to a primitive (ex: int
). One has to make use of the System.Convert
object methods. The method naming conventions are standardized well, so we can make a pretty good guess from the column data type.
string ConversionMethodName(string typeName)
{
string tname = typeName.Substring(typeName.LastIndexOf(".") + 1);
return "To" + tname;
}
GatherAll
The only other method we'll look at is the one to generate our GatherAll
factory method. To refresh, here is our target code:
public static ICollection GatherAll()
{
string sql = this.SimpleEntitySelect() + " FROM Simple ;";
DataTable table = DataAccess.ExecuteSqlSelect(sql);
ArrayList lst = new ArrayList();
for(int i = 0; i < table.Rows.Count; i++)
{
lst.Add(this.EntityFromTableRow(table.Rows[i]));
}
return lst;
}
So our target is eight lines of code (11 if you count the method declaration). We know from above that public static
is impossible with CodeDOM, so we'll just make it public
. I'll save the suspense and just give you the code, then we'll talk about it.
CodeMemberMethod GatherAllMethod()
{
CodeMemberMethod gatherMethod = new CodeMemberMethod();
gatherMethod.Name = "GatherAll";
gatherMethod.ReturnType = new CodeTypeReference("ICollection");
gatherMethod.Comments.AddRange(
this.SummaryComments("Returns a collection" +
" of all entities in the database"));
gatherMethod.Attributes = MemberAttributes.Public;
" FROM Simple WHERE SimpleID='" + key + "';";
gatherMethod.Statements.Add(
new CodeVariableDeclarationStatement(typeof(string), "sql",
new CodeMethodInvokeExpression(new
CodeThisReferenceExpression(),
"BaseEntitySelect",
new CodeExpression[]{})));
gatherMethod.Statements.Add(
new CodeSnippetExpression("sql += \" FROM " +
this.ToSqlName(this.tableName) + "\" "));
gatherMethod.Statements.Add(
new CodeVariableDeclarationStatement(
typeof(System.Data.DataTable), "table",
new CodeMethodInvokeExpression(new
CodeSnippetExpression("DataAccess"), "ExecuteSqlSelect",
new CodeExpression[]{new
CodeVariableReferenceExpression("sql")})));
gatherMethod.Statements.Add(
new CodeVariableDeclarationStatement("System.Collections.ArrayList",
"lst", new CodeObjectCreateExpression("System.Collections.ArrayList",
new CodeExpression[]{})));
CodeStatement[] iterationCode = new CodeStatement[]
{
new CodeExpressionStatement(
new CodeMethodInvokeExpression(new
CodeVariableReferenceExpression("lst"), "Add",
new CodeExpression[]{
new CodeMethodInvokeExpression(new
CodeThisReferenceExpression(),
"EntityFromTableRow",
new CodeExpression[]
{
new CodeArrayIndexerExpression(new
CodeMethodReferenceExpression(
new CodeVariableReferenceExpression("table"),
"Rows"), new CodePrimitiveExpression(0))
}
)})
)
};
gatherMethod.Statements.Add(
new CodeIterationStatement(
new CodeVariableDeclarationStatement(typeof(int),
"i", new CodePrimitiveExpression(0)),
new CodeBinaryOperatorExpression(new
CodeVariableReferenceExpression("i"),
CodeBinaryOperatorType.LessThan,
new CodeSnippetExpression("table.Rows.Count")),
new CodeAssignStatement(new CodeVariableReferenceExpression("i"),
new CodeSnippetExpression("i+1")), iterationCode));
gatherMethod.Statements.Add(
new CodeMethodReturnStatement(
new CodeVariableReferenceExpression("lst")));
return gatherMethod;
}
Soooooo, to generate 8 lines of code, CodeDOM requires... (throw out the white space, ignore the comments, carry the 1)... a whopping 47 lines of code!!! I counted 41 objects (count of objects by counting calls to "new
"). And it couldn't even be done without a few uses of the CodeSnippetExpression
class. To do the math, that's 500% bloat. Did I mention my template based solution didn't even look at this method since it was common enough to be part of the base class?
Conclusion
Building a code generator for one entity is a lot of work. A whole lot of work. But it's the same amount of work as building a code generator for twenty entities. Or two hundred. On the flip side, building one entity directly is a fair amount of work. Building two is a bit more. Building 20 is ridiculous, and building 200 would be ludicrous. My basic rule of thumb for anything repetitive is thus: If you're going to do it twice, do it manually. If you may do it three times, break it out or write a tool to automate it.
Final comments on CodeDOM
At this point the weaknesses of CodeDOM as a practical tool should be self-evident. By abstracting all calls and references, you get a language structure that, to put it kindly, is non-obvious. Target source code methods and patterns are difficult to maintain and make changes to when compared to template-based approaches where the code exists in the target format within the template. Complex methods or logic routines are all but impossible to write without just writing the code directly as a CodeSnippet
.
Another major problem with CodeDOM and how it generates code is its inability to have multiple attributes. I can see where the designers might want to restrict this so that users don't add mutually exclusive attributes to a given type (class, variable, etc.), but come on. Who hasn't needed a few public static
methods and variables now and then?
In exchange for these weaknesses, you have a powerful abstraction layer for the target language, but not powerful enough to completely divorce yourself from your language syntax (see all the CodeSnippet
objects used for relatively common code sequences). And you're confined to languages with CodeDOM support: i.e., VB.NET and C#.
After completing this stage of the project, I can say with the utmost confidence that I will not be using CodeDOM again in the near future. For the amount of bloat created by CodeDOM, I could hand code my generator to support three or four languages with a series of case
statements. All is not lost, however. There are some projects out there to wrap CodeDOM in a way to make it actually work somewhat as promised. The NGenerate project sounds promising and there are several XML/XSLT based projects floating around. You'll have to do the research yourself.
But now we have a code generation tool built and can move on to the next stage: Packaging and Subclassing for Rapid Refactoring. Additional "Fancy Features" to consider building on your own time might include integrated logging and performance counters, exception handling, partial field updates, and stored procedure autogeneration and support. If you want to get crazy, you could use sp_fkeys
to build entities with child-entity collections already included. Or not.
Preview Part III
In Part III, we'll show how to package and subclass your generated code for rapid refactoring. We'll code a simple web application and demonstrate how database schema changes can have little or no effect on your human-maintained code.