Introduction
In my article on loading any object from any database, I made a rather long list of shortcomings of that code. In this article, I am going to fix a few of those by using custom attributes. By the end of this article, we will have fixed, or be able to fix, the following:
- Being forced to use the poorly performing SELECT *
- Inability to map table and column names that aren't valid variable or class names
- Reliance on slow-performing property access
- Inability to protect data
Background
In .NET, you have the ability to "decorate" your code, that is, to provide metadata about your code that you can later retrieve through Reflection. This sounds a little strange, but if you have ever used XML serialization, you have already seen decorations like this (borrowed from the MSDN website):
public class Address
{
    [XmlAttribute]
    public string Name;
    public string Line1;
    [XmlElementAttribute(IsNullable = false)]
    public string City;
    public string State;
    public string Zip;
}
Of course, reusing those decorations wouldn't really be useful; we need to create our own. The XML serialization code does exactly what we did in the step 1 article: it uses the variable and class names for nodes, and only needs the attributes for exceptions. I want to use my custom attributes to limit which fields are used; relying on defaults would leave me missing information later.
For classes, we need an attribute that will give us a minimum of two pieces of information:
- The schema information, if there is any
- The table name
For columns, we will need at least two pieces of information:
- The name of the column in the table
- The DbType of the column (for creating parameters, though I won't be touching on it in this article)
Since this article isn't about creating or reading custom attributes, I will let you research that on your own; there are several good tutorials available.
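That said, for readers who would like a taste without leaving the page, here is a minimal, self-contained sketch of the reflection pattern the rest of this article relies on. The DemoAttribute and Decorated names are invented purely for illustration:

```csharp
using System;

// A throwaway attribute, named with the required "Attribute" suffix.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class DemoAttribute : Attribute
{
    public string Name { get; set; }
}

// Note the suffix is dropped at the point of use: [Demo], not [DemoAttribute].
[Demo(Name = "Example")]
public class Decorated { }

public static class AttributeDemo
{
    public static void Main()
    {
        // GetCustomAttributes returns every attribute instance on the type.
        object[] attributes = typeof(Decorated).GetCustomAttributes(false);
        foreach (Attribute attribute in attributes)
        {
            DemoAttribute da = attribute as DemoAttribute;
            if (da != null)
            {
                Console.WriteLine(da.Name); // prints "Example"
            }
        }
    }
}
```

This is exactly the loop shape the loader function later in the article uses to find its TableAttribute and ColumnAttribute decorations.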
I have chosen to enforce a couple of rules on my TableAttribute class, including only allowing the attribute on a class (not on a property or field), and only allowing a single instance of the attribute on the class. So, here is the code for this:
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class TableAttribute : Attribute
{
    public string Schema { get; set; }
    public string Name { get; set; }
}
To avoid the inherent speed problem with using properties, and because I would like to be able to use properties to track changes, I chose to make the column attributes only valid for fields (instance variables). This gives me the ability to protect my data from accidental changes, because I can eliminate the 'set' portion of the property for columns that aren't allowed to be changed. Data protection is a good thing!
[AttributeUsage(AttributeTargets.Field, AllowMultiple = false)]
public sealed class ColumnAttribute : Attribute
{
    public DbType DbType { get; set; }
    public string Name { get; set; }
}
Note: For those who don't know, there is something a little weird about creating your own attributes: you are always supposed to name them with the suffix Attribute, hence my two attributes are TableAttribute and ColumnAttribute. The weird part is that when you start using them to decorate your code, the "Attribute" portion of the name is dropped, so they become Table and Column.
If we look at my article on passwords or my article on SQL Server indexed views, you couldn't use those tables with the code from my step 1 article. However, if we add the new attributes, the problematic table Vehicle can have its object look like this:
[Table(Schema = "dbo", Name = "Vehicle")]
public class Vehicle
{
    [Column(DbType = DbType.Guid, Name = "ID")]
    private Guid _id;

    [Column(Name = "Vehicle Identification Number",
        DbType = DbType.String)]
    private string _vehicleIdentificationNumber = string.Empty;

    public Guid ID
    {
        get { return _id; }
        set
        {
            if (value != _id)
            {
                _id = value;
            }
        }
    }

    public string VIN
    {
        get { return _vehicleIdentificationNumber; }
        set
        {
            value = (value ?? string.Empty).Trim();
            if (string.Compare(value, _vehicleIdentificationNumber) != 0)
            {
                _vehicleIdentificationNumber = value;
            }
        }
    }
}
As you can see, I gave a hint as to how to implement tracking if the object is dirty. I also used a shorter name for a column, the more familiar VIN, rather than Vehicle Identification Number.
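To make that hint concrete, here is a minimal sketch of where dirty tracking would plug in. The _isDirty field, the IsDirty property, and the TrackedVehicle name are my own invention for illustration, not part of the article's code:

```csharp
using System;

// Because the setters already skip assignment when the value is unchanged,
// a dirty flag only needs one extra line per setter.
public class TrackedVehicle
{
    private Guid _id;
    private bool _isDirty;

    public bool IsDirty { get { return _isDirty; } }

    public Guid ID
    {
        get { return _id; }
        set
        {
            if (value != _id)
            {
                _id = value;
                _isDirty = true; // only flips when the value actually changes
            }
        }
    }
}
```

A loader could clear the flag after populating the fields, so IsDirty reflects only changes made after the object was loaded.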
Now, we want to pull only the columns we need from our table, instead of using SELECT *. To do this, I will do the following:
- Query all the fields and check whether they have my ColumnAttribute; if they do, record the column name.
- Wrap the column names using the CommandBuilder function QuoteIdentifier.
- Get the schema and table name from the TableAttribute and wrap them as well.
So, let's build the function to do just that.
private static BindingList<T> LoadObjectsFromDatabase<T>(
    string connectionString, string providerName) where T : new()
{
    var factory = DbProviderFactories.GetFactory(providerName);
    var commandBuilder = factory.CreateCommandBuilder();
    Type type = typeof(T);
    string name = string.Empty;
    {
        var attributes = type.GetCustomAttributes(false);
        foreach (Attribute attribute in attributes)
        {
            TableAttribute ta = attribute as TableAttribute;
            if (ta != null)
            {
                if (!string.IsNullOrEmpty(ta.Schema))
                {
                    name = string.Format("{0}{1}",
                        commandBuilder.QuoteIdentifier(ta.Schema),
                        commandBuilder.SchemaSeparator);
                }
                name += commandBuilder.QuoteIdentifier(ta.Name);
                break;
            }
        }
    }
    if (string.IsNullOrEmpty(name))
    {
        throw new Exception(
            "Unable to build SQL because custom attribute doesn't exist.");
    }
    var fieldMap = new Dictionary<string, FieldInfo>();
    string comma = string.Empty;
    var sb = new StringBuilder("SELECT ");
    var fields = type.GetFields(BindingFlags.NonPublic | BindingFlags.Instance);
    foreach (var f in fields)
    {
        var attributes = f.GetCustomAttributes(false);
        foreach (Attribute attribute in attributes)
        {
            ColumnAttribute ca = attribute as ColumnAttribute;
            if (ca != null)
            {
                fieldMap.Add(ca.Name, f);
                sb.Append(comma);
                comma = ", ";
                sb.Append(commandBuilder.QuoteIdentifier(ca.Name));
            }
        }
    }
    if (fieldMap.Count < 1)
    {
        throw new Exception("No columns in query");
    }
    sb.AppendLine();
    sb.AppendLine("FROM");
    sb.Append(name);
    using (var connection = factory.CreateConnection())
    {
        connection.ConnectionString = connectionString;
        connection.Open();
        using (var command = connection.CreateCommand())
        {
            command.CommandText = sb.ToString();
            command.CommandType = CommandType.Text;
            using (var dr = command.ExecuteReader(CommandBehavior.CloseConnection))
            {
                return LoadObjectsFromDataReader<T>(dr, fieldMap);
            }
        }
    }
}
This code breaks down into three basic sections which, if I were building a full-blown solution, would be better organized for code reuse.
Portion 1
This section is dedicated to getting the custom attribute off the class, so we know what the table name is.
Portion 2
This is dedicated to getting the list of fields from the object and getting the custom attributes off them. I use those pieces of information to build a Select statement that selects only the columns for which we actually have fields to store the information in. This is a huge benefit over SELECT * since it will execute faster, and neither reordered columns nor new columns in the table will affect it. I also store the column name and the field info for later use. This section finishes up by appending the FROM clause to the SQL statement. Note: Using SELECT * forces your database engine to expand the * into the list of column names, so rebuilding this SQL statement every time you need it would cost you the very speed boost you gained by not using SELECT *. You should cache your freshly built statement, but I am not going to touch on that yet.
Portion 3
This should be reduced to just a few lines: first, a single call that executes the SQL statement, then passing the DbDataReader to the modified LoadObjectsFromDataReader function. As it stands, it uses the provider name and connection string to build a new connection to the database, making sure to set the CommandBehavior to CloseConnection; that way, the connection closes automatically when the data reader runs out of records or is disposed.
Now, we need to modify the existing LoadObjectsFromDataReader
function from the step 1 article.
private static BindingList<T> LoadObjectsFromDataReader<T>(
    IDataReader dr, Dictionary<string, FieldInfo> fieldMap) where T : new()
{
    BindingList<T> retval = new BindingList<T>();
    object[] oo = new object[fieldMap.Count];
    while (dr.Read())
    {
        dr.GetValues(oo);
        T item = new T();
        int fieldIndex = -1;
        foreach (var kvp in fieldMap)
        {
            FieldInfo fi = kvp.Value;
            object o = oo[++fieldIndex];
            if (DBNull.Value.Equals(o))
            {
                o = null;
            }
            try
            {
                fi.SetValue(item, o);
            }
            catch
            {
            }
        }
        retval.Add(item);
    }
    return retval;
}
If you compare this to the original function, you see that I am no longer querying the object for a list of properties. Instead, I am using my map of column names to fields, and because I know that my DbDataReader
and my fieldMap
are in sync, I can simply iterate through them, filling the fields in as I go.
Using the Code
Using this code is easy. As I stated in the step 1 article, Linq2Objects is a great way to get data out of disparate data sources, so while this example is using only data from a single database, it doesn't matter; once they are loaded, we can query them all we like.
var vehicles =
    LoadObjectsFromDatabase<Vehicle>(connectionString, providerName);
var vehicleLicensePlates =
    LoadObjectsFromDatabase<Vehicle_LicensePlate_Issuer>(connectionString, providerName);
var licenseIssuers =
    LoadObjectsFromDatabase<LicensePlate_Issuer>(connectionString, providerName);
var licenses =
    LoadObjectsFromDatabase<LicensePlate>(connectionString, providerName);
var query = from license in licenses
            join licensePlateIssuer in licenseIssuers
                on license.ID equals licensePlateIssuer.LicensePlateID
            join vehicleLicensePlate in vehicleLicensePlates
                on licensePlateIssuer.ID equals vehicleLicensePlate.LicensePlate_Issuer_ID
            join vehicle in vehicles
                on vehicleLicensePlate.Vehicle_ID equals vehicle.ID
            where vehicle.VIN == "asdf" && vehicleLicensePlate.Current
            select license;
This example simply loads a number of objects from my example database (the example database from that article contains data), and returns the one record that is meaningful for this query.
Points of Interest
This article addresses all of the shortcomings mentioned at the top, but I still haven't touched on the actual caching of queries, or using the custom attributes to create Insert
or Update
statements (though you could use the CommandBuilder
to help with that).
The single largest glaring shortcoming of this code is that there is no way to limit your data to a subset of rows; in other words, I have no WHERE clause.
Before the band starts tuning up, let's look at some of the issues with this code. The DbDataReader and fieldMap could be out of sync; that should be checked and isn't. The LoadObjectsFromDatabase function should probably take just the connection name, rather than requiring multiple values to be passed. The CommandBuilder.QuotePrefix and QuoteSuffix aren't always filled in, so you have to make sure they are right before you use QuoteIdentifier. Not every database supports schemas, and those that do may not support them properly. In some cases, it is easier to ignore that; however, sometimes it is not an option. For example, if you are using a database designed like the new AdventureWorks demo database from Microsoft, schemas are absolutely required.
While there are still major shortcomings to this code, this code can now pull back any object from any database, and this includes the new types from SQL Server 2008. Obviously though, if you are using those database specific types, you won't be able to swap out database servers underneath your code. However, simply changing the connection string (and possibly the schema) will allow you to move your tables of data from one database to another.
Another great thing is that, assuming you match the case of your Name strings correctly, this code allows you to map a column to any variable name; the column name doesn't even have to be meaningful, and characters like spaces and SQL keywords won't break your code.
History
- 2010-02-18: Fixed bug in LoadObjectsFromDataReader (changed field-index from post-increment to pre-increment).
- 2010-02-17: Original article.