Design and Implementation of an Attribute-Driven, Caching Data Abstraction Layer

owen654321

4.98/5 (25 votes)

21 Jul 2008CPOL30 min read

595

An easy-to-use, attribute-driven data abstraction layer with multi-database support, intelligent caching, transparent encryption, multi-property sorting, property change tracking, etc.

Download source code and demo project - 238 KB

Introduction

This article presents a data abstraction layer (DAL) designed to:

Reduce or eliminate the use of hardcoded SQL statements. SQL queries, updates, inserts, and deletes should be generated by the data abstraction layer when needed.
Alleviate the need for developers to write repetitive, database-specific code.
Use intelligent data caching to reduce chatter between the code and the database engine, improving application speed.
Enable migration between database platforms (e.g., Microsoft SQL Server, MySQL) without requiring significant modifications to existing code.
Create a single point of access for code-database interaction, reducing the risk of poor coding practices that could lead to connection leaks or vulnerability to SQL injection attacks.
Provide transparent, basic string mangling and encryption routines to protect database content from prying eyes.

Background

My initial goal in designing and implementing this data abstraction layer (DAL) was to make my job - helping design and code a large website from the ground up - easier. Designed to be a simple and light-weight library with a limited feature set, the end product turned out to be a much larger beast than originally planned, both in terms of size and functional scope. I attribute the difference between the project's original and current design to a continuous stream of feature requests (not only from the client, but also from the other developer on the project) and to constantly-changing requirements during development. Still, creating this data abstraction layer solo was a great learning experience, and quite exciting at times ("testing" a new version involved uploading it to the heavily-trafficked (production) website for immediate testing - trial-by-fire-style). In other words, this was one of those rare "no program managers, screw the unit tests, specs-be-damned, design documents are a waste of time" kind of projects. Ahh, fun times.

Data Abstraction Layer - Basic Definition

A data abstraction layer (DAL) is a library used to facilitate communication between code and database. In code, the developer issues all query, insert, update, and delete requests to the DAL - not directly to the database. By providing a standard interface to expose database-related functionality, the DAL hides the complexity associated with code-database communication, and at the same time, eliminates the need for developers to write database-engine specific data access code (e.g., SQL Server uses SqlDataReader, MySQL uses MySqlDataReader; SQL Server supports named parameters - @firstname, OLEDB and MySQL use ? as parameter placeholders).

How it Works (and Implementation Options)

The data abstraction layer is responsible for generating Query, Update, Insert, and Delete statements as needed, and populating class property values using data returned from the database. Therefore, the data abstraction layer needs to be able to translate between table names and class types (i.e., table person stores data representing instances of class Person), and database columns and class properties (i.e., column first_name maps to class property FirstName). The way in which data abstraction layers address this requirement varies from implementation to implementation. Several DAL implementations use an XML or other data file to supply the mapping between classes/properties and tables/columns. Other implementations read table schemas directly from the database, and use the schema information to generate class definitions (either during a pre-compilation step, or dynamically at runtime). Still, other DAL implementations - this one included - use custom attributes to associate table names with class types and database columns with class properties. The figure (above right) shows a DAL-enabled class decorated with custom attributes.

In the Beginning...

In the beginning, there were two custom attributes: the class attribute TableNameame associated a class type with a database table, and the property attribute MappedToColumn associated a property with its representative column in the database.

After creating the two attributes, I finished the project by writing methods that used Reflection to read the attribute values and auto-generated Update, Insert, and Delete statements. Piece of cake...

Piece of cake, except - the developer I was working with (also the database designer) was accustomed to embedding hard-coded SQL statements directly into code wherever and whenever data access was required. Frequently, he felt restricted by our use of the data abstraction layer (I must admit that during the initial stages of development, I also was tempted to go back to plunking massive, hardcoded SQL statements directly into web page code-behinds). When he felt this way, he would say something to the effect of, "Trying to use the DAL to do X is really frustrating, and it is easy to do with a SQL statement... so what's up with your library?" These bug reports / complaints were almost always cases where he was attempting to use the DAL well outside of its original design scope. In hindsight, these complaints were the feature requests which continuously drove development forward.

(Thus began the first era of feature creep...)

Data Abstraction Layer - Example Use

Before getting into the nitty-gritty details of implementation, I'd like to take a minute to:

provide a broad overview of how existing classes can be modified to become DAL-enabled, and
present several code snippets that demonstrate the use of DAL-related functionality.

To use the data abstraction layer, existing C# or VB.NET classes are decorated with instructive attributes, and must inherit from the DAL-enabling generic base class, DBLayer<T>. By applying these basic decorative attributes and modifying the class to inherit from DBLayer<T>, existing .NET classes can be quickly transformed into fully database-backed, queryable entities. Several examples that demonstrate the use of DAL-related functionality are presented below:

The data abstraction layer handles how and when queries are issued; the developer is free to focus on writing readable, logical code. For example, retrieving a "Person" entity by a unique ID resembles a standard dictionary lookup:

// Find a specific person by id
Person p = Person.Find[19387];

Abstraction of logic allows the programmer to use the same code, whether querying the database or retrieving results from in-memory cache. For example, using the OrderBy class allows collections to be sorted using multiple properties; when the results are retrieved from the database, a SQL ORDER BY statement is used; when results are retrieved from cache, in-memory multi-property sorting is used.

// Retrieve all People, sorting/grouping by last name,
// then (within identical last names) by first name:
OrderBy<Person> orderByLastFirstName = 
  new OrderBy<Person>("last_name").AddOrderBy("first_name");

List<Person> orderedPeople = Person.DALQuery(orderByLastFirstName);

Intelligent caching is used to reduce chatter with the database. In the example below, if the code is executed after running example #2 (above), results will be retrieved from in-memory cache; no database query is issued (unless the in-memory cache has been invalidated by an Insert, Update, or Delete performed on a Person entity prior to execution of the following statement):

// View all people with first name of "Fred"
// whose information has been updated in the past 30 days.
// Note: with caching enabled, this query
// will not hit the database - results will be retrieved
// from cache (which was conveniently populated
// during execution of the above statement)

List<Person> people = WorkOrder.DALQuery_List(
    (Col_FirstName == "Fred") & 
    (Col_LastUpdated > DateTime.Now.AddDays(-30)));

Identity/AutoIncrement values are auto-populated as soon as a new entity is inserted into the database using the Insert() method.

// Create and insert a new Person entity - the autoincrement
// value is automatically populated after insert
Person p = new Person() { FirstName = "Owen", LastName = "Emlen" };
Person.Insert(p);

Console.WriteLine("The identity of the newly inserted Person is " + 
                  p.person_id.ToString());

Data Mangling and Encryption-related attributes can be attached to properties in order to transparently mangle or encrypt their values (before being written to the database), and to de-mangle or decrypt values (after being retrieved from the database). The end results are property values that can be read from (and written to) as usual in code, but appear obfuscated when viewed in the database.

Accessing sensitive data via code:

The same data, viewed in the database:

Implementation Details

As mentioned in the Background section, the data abstraction layer relies on custom attributes attached to class definitions and properties to translate between references to code objects (classes, properties) and references to database objects (tables, columns). Translating property names into column names allows the data abstraction layer to transparently generate Query, Update, Insert, and Delete statements. Translating column names into property names allows the DAL to populate the appropriate class properties using values returned from a database query.

TableName(string tableName) - Attached to a class definition, this attribute specifies the database table name associated with the class. If no table name is specified, the table name is assumed to be the same as the class name.
MappedToColumn(string columnName) - Attached to a property, this attribute specifies the database column name associated with this property. If no column name is specified, the column name is assumed to share the same name as the property.

Encapsulating Search Logic: Class DBWhere

The TableName and MappedToColumn attributes handle the "vocabulary" portion of the translation, allowing the data abstraction layer to reference both classes and properties (in code) and tables and columns (in the database). However, to communicate with both the code and the database, the data abstraction layer must also be fluent in the grammars and syntaxes used to communicate logical requests.

For example, in code, a for loop and comparison operators can be used to find and retrieve specific items in a collection. Finding and retrieving similar items in the database involves writing a WHERE statement that specifies which items should be retrieved. Although the syntax used to retrieve items from an in-memory collection differs from the syntax required to retrieve items from a database, the underlying logic that specifies how to identify the items we are searching for is syntax-invariant. I created the DBWhere class in an attempt to provide a declarative, syntax-invariant way for the developer to send unambiguous search logic directly to the data abstraction layer. Using the syntax-invariant logic represented by an instance of DBWhere, the data abstraction layer either conveys the search logic to the database via the construction of a SQL WHERE statement, or executes appropriate search-related code to locate specific items in an in-memory collection.

There are several benefits associated with using DBWhere to represent search/query logic in the data abstraction layer. First, by using a class to encapsulate the query logic (instead of a SQL-like query string or multiple procedural statements), search logic can be quickly, accurately, and uniquely identified and categorized. By uniquely identifying search logic, the data abstraction layer can determine whether results can be retrieved from an in-memory cache or whether it is necessary to issue a database query to retrieve the results. Additionally, by categorizing portions of search logic, the data abstraction layer can identify frequently used logic and build in-memory indices to optimize search performance. Another benefit of using DBWhere to encapsulate search logic is that widely-used logical definitions (i.e., definition of an active user) can be defined at the application level, encouraging shared, standardized use of the definition. For instance, placing the logic that defines an "active user" in a global module for shared use helps ensure that the definition of an "active user" will not vary from module-to-module throughout an application. Plus, if a change in logic is required (the definition of an "active user" is revised), the change can be made in a single location, and will automatically be reflected in all modules that reference the logic.

public partial class UserInfo : DBLayer<UserInfo>, 
               IDBLayer<UserInfo>, INotifyPropertyChanged{
.....
   // Active User Definition/Logic:
   // 1. Admins are always considered active
   // 2. If not an admin, user must be verified, not banned, 
   //    and must have logged in once within the past 90 days

   public static readonly DBWhere ActiveUserDefinition =
        (Col_security == SecurityLevel.Admin) |
        (Col_verified == true & Col_banned == false & 
         Col_last_login > DateTime.Now.AddDays(-90));

   public static List<UserInfo> ReadAllActiveUsers()
   {
      return DALQuery_List(ActiveUserDefinition);
   }...
}

Defining Search Logic Using DBWhere

The DBWhere search logic can be constructed using several methods:

By explicitly creating an instance of the DBWhere class, using the constructor and the class methods to specify the search logic.
By using the DALColumn object, built-in operator overloads, and implicit casting to construct a 'friendly-looking' DBWhere logic statement (an idea I borrowed from this CodeProject article by Chad Z. Hower aka Kudzu).

// Explicit creation of DBWhere logic using the constructor and class methods
DBWhere where = new DBWhere("FirstName", "Fred").AND("LastName", "Flintstone");
List<Person> results = DALQuery(where);

// Construction of a 'friendly-looking' DBWhere logic statement
List<Person> results = DALQuery(Col_FirstName == "Fred" & 
                   Col_LastName == "Flintstone");

DBWhere Internals

Internally, each instance of DBWhere contains a collection of match requirements/logic, stored in a collection of PropNameAndValues. Each PropNameAndValue instance contains a property name (i.e., "FirstName"), a value to which the specified property value will be compared (i.e. "Fred"), and a comparison type that specifies the type of comparison to be performed (i.e. "="). Multiple instances of DBWhere can also be combined with other DBWhere statements using various combination logic (AND, OR, AND NOT, OR NOT, etc.). When combined, a collection of linked DBWhere statements form a logic tree (with nodes as DBWhere objects, connected by links that represent a specific combination logic).

Class diagram for DBWhere.

Translating DBWhere Logic into a SQL WHERE Statement

In order to effectively express search logic in a database query, a SQL WHERE statement that represents the logic contained in the DBWhere object must be created. At the most basic level, the method BuildAndFromPairs builds a WHERE-compatible SQL statement using the match requirements specified by a single DBWhere instance. As an example, if a DBWhere object containing two requirements (FirstName == "Fred" and LastName == "Flintstone") is passed in as a parameter, method BuildAndFromPairs will return the SQL-statement-equivalent that can be used to identify records meeting both requirements (first_name='Fred' AND last_name='Flintstone'). The method implementation is presented below (if you want a closer look, want to know how called methods are implemented, or want to see how DBWhere statements are combined using different combination logic, open up DALToSQL.cs in Visual Studio):

/// <summary>
/// Takes basic AND pair requirements
/// (PropertyName (logic) Value) and creates a SQL AND statement
/// </summary>        
/// <param name="parametersOut">
/// This parameter is used to output SqlParameters if the query is parameterized.
/// For instance, if a pair contains parameter
/// @CustID with value 42, then parametersOut will contain 
/// the parameter value 42 and associate it with the token @CustID
/// </param>
/// <returns>
/// A StringBuilder containing a SQL logic statement
/// appended to the existing contents of sb
/// </returns>
protected static StringBuilder BuildAndFromPairs(
   StringBuilder sb, PropertyNameAndValueCollection pairs, 
   ref List<SqlParameter> parametersOut)
{
   bool wroteAnything = false;

   foreach (PropNameAndVal pair in pairs._propertyNameAndValuePairs)
   {
      PropertyDescriptor propDescriptor = 
         CachedAttributes.GetPropertyDescriptor(pair._propertyName);

      if (propDescriptor == null)
      {
         throw new Exception(
            String.Format("Property name {0} associated with class {1} not found!",
               pair._propertyName, ClassType.Name));
      }

      string columnName = 
        CachedAttributes.GetColumnNameForProperty(propDescriptor);

      // only generate SQL for properties
      // that have corresponding database column names
      if (columnName != null)
      {
         SqlDbType sqlType = CachedAttributes.GetSqlDataType(propDescriptor);

         // Some SQL types we cannot use in queries
         if (sqlType == SqlDbType.Text || sqlType == SqlDbType.NText ||
             sqlType == SqlDbType.Binary || sqlType == SqlDbType.Image) continue;

         if (!wroteAnything)
         {
            // first time open parens
            wroteAnything = true;
            sb.Append("(");
         }
         else
         {
            // subsequent times AND together statements
            sb.Append(" AND ");
         }

         if (pair._nonStandard)
         {
            // For queries that currently cannot be handled using SQL, insert a placeholder
            // that always evaluates to true: <large_unique_number> = <large_unique_number>
            int u = (pair._propertyName + pair._value.ToString()).GetHashCode();
            sb.Append("(" + u.ToString() + "=" + u.ToString() + ")");
            continue;
         }

         object comparisonValue = pair._value;
         string sqlComparison;

         if (pair._isParameterized)
         {
            // Handle parameterized queries                  
            if (parametersOut == null) parametersOut = new List<SqlParameter>(4);
            Type propType = CachedAttributes.GetPropertyType(propDescriptor);

            // Handle string mangling or encryption, if specified
            if (propType == typeof(string))
            {
               if (comparisonValue != null)
               {
                  string formattedValue = 
                     PadAndTrim(propDescriptor, comparisonValue.ToString());
                     
                  bool wantMangle = WantManglePropertyValue(propDescriptor);
                  bool wantRijindael = WantUseRijindaelProperty(propDescriptor);

                  // Possibly Mangle/Encrypt the string value before writing out
                  if (wantMangle)
                     formattedValue = ManglingProvider.Encrypt(formattedValue);

                  if (wantRijindael)
                     formattedValue = EncryptionProvider.Encrypt(formattedValue);

                  comparisonValue = formattedValue;
               }
            }

            if (_dataTransport == DataTransport.SqlDirectConnection)
            {
               // Use Sql named parameters
               SqlParameter sqlparam = 
                  new SqlParameter(pair._parameterName, comparisonValue);
                  
               parametersOut.Add(sqlparam);

               // Assume parameter names are passed in as @name
               sqlComparison = pair._comparisonOperator + pair._parameterName;
            }
            else
            {
               // OleDb does not support named parameters... 
               // we must use the column names and ?'s instead
               SqlParameter sqlparam = 
                  new SqlParameter(pair._propertyName, comparisonValue);
                  
               parametersOut.Add(sqlparam);
               sqlComparison = " " + pair._comparisonOperator + " ?";
            }
         }
         else
         {
            // No parameterization
            // Determine the comparison operator to use
            string comparison = (pair._logicNOT) ?
               pair._comparisonOperator.ReverseSQLComparisonOperator() :
               pair._comparisonOperator;

            sqlComparison = GetSQLComparison(
               propDescriptor, pair._value, comparison);
         }

         sb.Append(columnName);
         sb.Append(sqlComparison);
      }
   }

   // Close parens if we wrote anything to sb
   if (wroteAnything)
      sb.Append(")");

   return sb;
}

Querying an In-Memory Collection using DBWhere

The QueryResultSet method uses DBWhere-specified logic to locate matching results in an in-memory collection. For those interested, the implementation of the QueryResultSet method can be found in DAL\DALQueryInMemory.cs.

Populating DAL-Enabled Classes using Database Query Results

The SQLLoadRows method issues a database query and then populates a strongly typed ResultSet using the returned query results. The method is quite lengthy; for those interested in its implementation, it can be found in DAL\DatabaseSupport\DALSqlServer.cs.

Design notes and suggestions for future improvement:

At runtime, DAL instantiates a concrete implementation of the IDALDatabase interface to handle all database-engine-specific communication. The IDALDatabaseClassName class attribute is used to specify the name of the IDALDatabase-implementing class to instantiate at runtime.

The solution includes two implementations of this interface:

DALSqlOleDb, which uses OLEDB-specific objects (i.e., SqlConnection) to communicate with the database, and
DALSqlServer, which uses SQL Server-specific objects (i.e., SqlConnection) to communicate with the database.

The use of the IDALDatabase interface allows developers to easily switch between different back-end database engines. For example, if you are writing a large web application (using SQL Express as your database engine), and your boss informs you a week before going live that he won't pay for a full MS SQL Server license, there's no need to panic. The next day, you download and install the MySQL database engine, migrate your database schema, change the value of your IDALDatabaseClassName class attributes to "DALMySql", and tweak your connection string. You're done - no additional code modifications are required. (If you are interested in a MySQL implementation of IDALDatabase, let me know; I chose to omit it from the solution because of its dependency on the MySQL-specific .NET data provider, which is an open source project). Note: If the IDALDatabaseClassName attribute is missing, the data abstraction layer uses the built-in class, DALSqlServer, to handle communication with the database.

Currently, the DAL expects that SQLLoadRows will both:

issue a database query, and
populate a strongly-typed ResultSet using the returned query results.

In the future, I'd like to determine if task #2 can be generalized and extracted from the database-engine-specific SQLLoadRows method.

Avoiding Repetitive Use of Reflection

In the code I've presented so far, you may have noticed statements like CachedAttributes.GetColumnNameForProperty(propDescriptor) and CachedAttributes.GetSqlDataType(propDescriptor). These methods retrieve a custom attribute value associated with a specific property. Repeatedly reflecting on properties to locate custom attributes is slow and inefficient; therefore, these methods have been carefully implemented to use Reflection to read the custom attribute value the first time the attribute value is requested. After the initial use of Eeflection, subsequent retrieval of the custom attribute value is handled via an in-memory (dictionary) lookup.

// Using reflection every time we want
// to retrieve the value of a MappedToColumn attribute
// attached to a property would be inefficient and slow...

object[] objs = propInfo.GetCustomAttributes(typeof(MappedToColumn), true);

The AttributeLookup class was created in an attempt to ease the process of initially retrieving custom attribute values using Reflection, and then satisfying subsequent requests using a dictionary lookup. Example use:

AttributeLookup<string, LimitTextLength, T> _maxFieldLength = 
  new AttributeLookup<string, LimitTextLength, T>();
    
// Initially uses reflection to locate the LimitTextLength
// attribute associated with the property FirstName
LimitTextLength attrib = _maxFieldLength.Find("FirstName");
...

// The request for the LimitTextLength attribute
// associated with property FirstName is now handled 
// via dictionary lookup
LimitTextLength attrib = _maxFieldLength.Find("FirstName");

The implementation of the AttributeLookup class is presented below:

/// <summary>
/// Provides common code for fast dictionary lookups of an attribute by key, 
/// with dictionary lookup populated on demand
/// </summary>    
public class AttributeLookup<TKey, TAttribute, TClass>
   where TAttribute : Attribute
   where TClass : DBLayer<TClass>, IDBLayer<TClass>, new()
{
   // Dictionary used to quickly find a custom attribute
   // associated with a property by lookup key
   private Dictionary<TKey, TAttribute> _lookup;

   public AttributeLookup()
   {
      _lookup = new Dictionary<TKey, TAttribute>();
   }

   public TAttribute Find(TKey key)
   {
      TAttribute val;

      // Attempt to retrieve the attribute using a fast dictionary lookup
      if (_lookup.TryGetValue(key, out val) == true) { return val; }
      else
      {
         // Find the property associated
         // with the specified key (usually property name)
         PropertyInfo propInfo = 
           DBLayer<TClass>.ClassType.GetProperty(key.ToString());
         return Find(key, propInfo);
      }
   }

   public TAttribute Find(TKey key, PropertyInfo propInfo)
   {
      TAttribute val;

      // Attempt to retrieve the attribute using a fast dictionary lookup
      if (_lookup.TryGetValue(key, out val) == true) { return val; }
      else
      {
         // Not found, get custom attributes
         // of type TAttribute associated with the property
         val = (propInfo == null) ? null : 
                propInfo.GetCustomAttribute<TAttribute>();

         // Add the attribute to the dictionary.
         // Use try/catch to handle race conditions, avoiding lock()
         try { _lookup.Add(key, val); }
         catch { }

         return val;
      }
   }
}

Solution Contents and Description

The solution file contains the following projects:

DAL	Data Abstraction Layer implementation - base and support classes.
DALSampleApplication	Demo Windows Forms application - example hookup and use of several DAL-enabled classes
CreateDALDBObjects	SMO-enabled routines that create a SQL database and tables from existing, attribute-decorated DAL classes.
SQLToDALLibrary	Code generation library, used in SQLToDALStandalone to help convert database tables to DAL-enabled classes.
SQLToDALStandalone	Code generation program used to create DAL-enabled classes from database table schemas. (See Using the DAL: Step by Step Instructions for more info on this program).
HyperPropertyDescriptor	Marc Gravell's HyperPropertyDescriptors Library - speeds up `PropertyDescriptor` Set/Get by a factor of 5-10x+.
EventArgLibrary	Defines and implements several convenient, generic `EventArgs`-derived classes.
Extensions	Useful extension methods.
ThreadSafeObjects	Easy-to-use, efficient, thread-safe collection classes.

The Demo Application

The demo application implements several DAL-enabled classes - Person, UserInfo, Address, and SecretInfo. The UI lets the user add, modify, and delete data within the person/address/secret info grids. The application was built to demonstrate DAL-related concepts such as how to implement cascading deletes, issue free-text queries, and the use of moderately-complex DBWhere logic (finding all other people with a primary address in the same zip code). A diagram of the DAL-enabled classes is shown below:

The fully decorated, DAL-enabled Person class is defined below:

[TableName, CollectionCacheable, PrepopulateCache(false)]
[IDALDatabaseClassName("DALSqlServer"), 
 ReadConnectionStringFromMethod("ReadConnectionString")]
public partial class Person : DBLayer<Person>, 
               IDBLayer<Person>, INotifyPropertyChanged
{
  private long _person_id;
  private long _user_id;
  private string _first_name = String.Empty;
  private string _last_name = String.Empty;

  /// <summary>
  /// Demonstrates another way to supply DAL with a connection string (at runtime)
  /// </summary>    
  public static string ReadConnectionString()
  {
    return DALSampleApplication.Properties.Settings.Default.ConnectionStringToUse;
  }

  [MappedToColumn, Identity, PrimaryKey, QuickLookup]
  [OneToMany(typeof(Address), "person_id"), 
   OneToMany(typeof(SecretInfo), "person_id")]
  public long person_id
  {
    get { return _person_id; }
    set { Set<long>(ref _person_id, "person_id", value); }
  }

  [MappedToColumn, JoinsTo(typeof(UserInfo))]
  public long user_id
  {
    get { return _user_id; }
    set { Set<long>(ref _user_id, "user_id", value); }
  }

  [MappedToColumn, NameOfObject, LimitTextLength(100)]
  public string first_name
  {
    get { return _first_name; }
    set { SetString(ref _first_name, "first_name", value); }
  }

  [MappedToColumn, LimitTextLength(100)]
  public string last_name
  {
    get { return _last_name; }
    set { SetString(ref _last_name, "last_name", value); }
  }

  /// <summary>
  /// Returns the associated UserInfo for this Person.
  /// Using JoinNonNull guarantees that 
  /// if there is no UserInfo associated with this Person,
  /// a new (blank) UserInfo object will be returned.
  /// </summary>
  public UserInfo UserInfo { get { return JoinNonNull<UserInfo>(); } }

  /// <summary>
  /// These are passthrough properties that reference
  /// properties from the Person's associated UserInfo record.
  /// </summary>
  public string LoginName { get { return UserInfo.login_name; } }
  public string Password { get { return UserInfo.password; } }

  /// <summary>
  /// Gets all addresses associated with this Person
  /// </summary>
  public List<Address> Addresses { get { return JoinMultiple<Address>(); } }

  /// <summary>
  /// Gets the 'secret information' records associated with this Person
  /// </summary>
  public List<SecretInfo> SecretInformation { get { return JoinMultiple<SecretInfo>(); } }

  /// <summary>
  /// FirstName + LastName
  /// </summary>
  public string FullName { get { return String.Format("{0} {1}", 
                           first_name.Trim(), last_name.Trim()).Trim(); } }

  /// <summary>
  /// Returns a comma-delimited list of all other people
  /// who have a primary address near this person (same zipcode)
  /// </summary>    
  public string NamesOfOtherPeopleWithPrimaryAddressInSameZipcode()
  {
    List<Person> otherPeopleInZip = OtherPeopleWithPrimaryAddressInSameZipcode;
    return otherPeopleInZip.BuildCommaDelimited<Person>(delegate(Person p)
           { return p.FullName; });
  }

  /// <summary>
  /// Returns the Person's primary address,
  /// or null if there is no primary address for the Person
  /// </summary>
  public Address PrimaryAddress 
    { get { return Address.FindPrimaryAddressFor(person_id); } }

  /// <summary>
  /// Find other people with a primary address near this person (same zipcode)
  /// </summary>
  public List<Person> OtherPeopleWithPrimaryAddressInSameZipcode
  {
    get
    {
      string primaryZipcode =
        (PrimaryAddress == null) ? string.Empty : PrimaryAddress.zip;

      return Address.PeopleWithPrimaryAddressInZipcode(primaryZipcode, 
                                                       person_id);
    }
  }

  /// <summary>
  /// Overridden Delete() - cascading delete:
  /// 1. Cleans up all addresses associated with the Person
  /// 2. Cleans up all secreted information associated with the Person
  /// 3. Calls the base DBLayer.Delete to delete the Person entity
  /// </summary>
  /// <returns></returns>
  public override string Delete()
  {
    Addresses.Delete();
    SecretInformation.Delete();
    return DBLayer<Person>.Delete(this);
  }
}

Development Challenges

I encountered many challenges during the development; if you've written a data abstraction layer yourself, you may be very familiar with some of these issues (suggestions for alternate solutions are welcome!):

Fixed-length strings (CHAR- or NCHAR-typed columns) require input padded to the appropriate length when used in a query, insert, update, or delete statement. Our address entity contained a two-letter state abbreviation property. The column that stored the value in the database was defined as a column of type CHAR(2). When we went to set the value of this state abbreviation property to String.Empty, inserts and updates failed.

The database was (appropriately) expecting two characters, but I sure as heck didn't want to have to remember to write address.StateCode = " "; whenever I wanted to specify a blank state code. Conversely, I found myself spending a lot of time needlessly debugging new code when I forgot to call Trim() on these fixed-length string property values before performing string comparisons, etc. The problem was solved by creating a PadText attribute, used to properly format the property value before using it in a query, insert, update, or delete statement.

Similarly, strings that exceeded the capacity of their backing (N)VARCHAR column caused SQL exceptions during inserts, updates, and deletes. Although we used server-side validation to catch most invalid/excessively-lengthy input fields, I couldn't imagine writing (and maintaining) text-length validators for each multiline 'Description' textbox on every form. For instance, I had no qualms about truncating inappropriately-lengthy user input on several management and maintenance forms used only by trained administrators.

I created a LimitTextLength(int maxLength) attribute for properties backed by a limited length (N)VARCHAR column. If string length exceeded the specified max length, the string would be truncated before use in query, insert, update, or delete statements.

/// <summary>
/// Returns a padded and/or trimmed string based
/// on the padding/maxlength attributes attached to a property
/// </summary>
internal static string PadAndTrim(PropertyDescriptor property, string s)
{
    int padLength = CachedAttributes.GetTextPadding(property);
    int trimLength = CachedAttributes.GetMaxTextLength(property);

    // If no trimming or padding required, return the original string
    if (trimLength == Int32.MaxValue && padLength == 0)
        return s;

    int stringLength = s.Length;

    if (stringLength > trimLength)
    {
        // Return a trimmed string
        return s.Substring(0, trimLength);
    }
    else if (stringLength < padLength)
    {
        // Return a padded string
        return s.PadRight(padLength);
    }

    return s;
}

The site started bogging down - the client's $10.99/month shared hosting plan had always supported their 5000+ users in the past...

Using the DAL made it simple to retrieve information from the database. Naturally, we began to retrieve, use, and display additional (and arguably non-essential) information. This practice resulted in a significant increase in the number of database queries issued. In addition, the way we were using the DAL to perform multiple smaller (on-demand) lookups resulted in 'chatty' communications between the DAL and the database engine. To solve this problem, a specialized WHERE cache was implemented to store query results associated with specific query logic. If data had not been modified and results were requested using identical query logic, results would be retrieved from an in-memory cache. The WHERE cache allowed us to return our focus to writing simple, readable code (instead of analyzing execution paths to make sure the same data was not requested twice).

Clients reported that their passwords occasionally appeared to be case sensitive, whereas on the 'old system', passwords were never case sensitive (and they wanted passwords to remain case-insensitive...)

Unless otherwise specified, SQL uses case insensitive comparison when performing string comparison with (N)VARCHAR-typed column values. The 'old system' used a hardcoded SQL statement to query the login table for the username and password specified by the user (SQL injection attack, anyone?). However, when searching in-memory cache, I was using case-sensitive string comparison. The solution was to implement selective case sensitivity for in-memory searches (and thus allowing passwords to remain case-insensitive - wonderful!)

After adding a TEXT column to a table, we were unable to use the DAL to delete entities (when the entity did not have a primary key defined).

For some reason (...), we had several tables with no defined primary key (be nice - I had little say in the database design). When there was no primary key to identify a row to delete, Delete() attempts to match all database column values in order to ensure the correct row is deleted. However, because TEXT fields cannot be used in a WHERE clause, the delete command failed. By creating a SQLDBType attribute, we were able to exclude certain SQL data types (TEXT, IMAGE, BINARY, etc.) from constructed WHERE clauses. In addition, I added the IgnoreForDeleteStatements attribute to handle any other cases where the property value was not to be used in delete statements.

Step-by-step Instructions for Generating DAL-Enabled Classes from Database Tables

Create a new project (type: Class Library) that will contain all of your DAL-enabled classes for the project.
Add all DAL-related references to the project: DAL, EventArgLibrary, Extensions, ThreadSafeObjects, and HyperPropertyDescriptor.
Start up the SQLToDALStandalone application (included in the demo solution).
Type in your server name (and/or named database instance), and click the Show DBs button to display all available databases and tables.
Place check marks next to all database tables for which you would like to generate C# classes:

Press the "Generate Classes from Table" button. Class files (.cs or .vb) will be created and placed into SQLToDALStandalone's application folder.
Add the newly-generated classes to your new project. Open each file and auto-format (Alt-E, then V, then Enter in Visual Studio).

Specifying the Connection String to Use

In order to enable database connectivity, a connection string needs to be specified. If you do not do so, a "No connection string specified for class (YourClassNameHere)" exception will be thrown. There are several ways to supply DAL-enabled classes with a connection string:

Add the ConnectionString(string connectionString) attribute to your DAL-enabled classes - below is an example. Although this is the fastest in terms of getting your DAL-enabled classes up and running, it's also the least flexible:

[ConnectionString("workstation id=localhost;" + 
            "packet size=4096;Connect Timeout = 45;" +
            "data source=OWEN-PC;uid=DALUser;" + 
            "pwd=codeproject;persist security info=True;" + 
          "initial catalog=DALSample;")]
public partial class SecretInfo : DBLayer<SecretInfo>, 
               IDBLayer<SecretInfo>, INotifyPropertyChanged
...

Add your connection string as an application setting (confirm that it is located in the <appSettings> section of the App.Config file, and not nested within another subsection), and then attach the ReadConnectionStringFromAppSettings(string configKey) attribute to your class definition (demonstrated in the sample application).
Specify the connection string to be retrieved from a static method at runtime using the ReadConnectionStringFromMethod(string methodName) class attribute. methodName will be resolved using Reflection, and called when the DAL-enabled class first attempts to access the database. The method should be static, and the DAL expects a valid connection string as the method's return value. Example:

[ReadConnectionStringFromMethod("ReadConnectionString")]
public partial class Person : DBLayer<Person>, 
               IDBLayer<Person>, INotifyPropertyChanged
{
...
    /// <summary>
    /// Demonstrates another way to supply DAL
    /// with a connection string (at runtime)
    /// </summary>        
    public static string ReadConnectionString()
    {
        switch (RuntimeEnvironment)
        {
            case RuntimeEnvironment.Local:
                return Properties.Settings.Default.LocalConnectionString;
                
            case RuntimeEnvironment.Test:
                return Properties.Settings.Default.TestConnectionString;
                
            case RuntimeEnvironment.Production:
                return Properties.Settings.Default.ProductionConnectionString;
        }
    }
...
}

Specifying Database & Database-Specific Options

If you are using SQL Server as your back-end database, you don't have to do anything; by default, the built-in DALSqlServer class is used to handle communication with the database.
If you are using an OLEDB data provider, you should:

add the [IDALDatabaseClassName("DALSqlOleDb")] attribute to your class, instructing the data abstraction layer to instantiate an instance of the DALSqlOleDb class to handle communication with the database, and
add the [DataTransportType(DataTransport.SQLOLEDB)] attribute to your class to instruct the data abstraction layer to use ? as a placeholder in parameterized queries.

Testing Your DAL-Enabled Classes

Your DAL-enabled classes are now ready for use. :)

Open an existing application (or create a new sample application) and add the standard DAL-related references along with a reference to your Class Library project containing your DAL-enabled classes.
Because DAL-related functions are implemented as static methods of the DBLayer<T> generic base class, you can get Intellisense help by typing the name of your DAL-enabled class and hitting dot. Example: Person..
A simple test query (like the one below) is a fine way to confirm that the DAL is able to connect to the database and issue a table-specific query:

Console.WriteLine(UserInfo.DAL_GetRowCount().ToString());

Cache Limits and other DAL-Related Settings

Each DAL-enabled class type has an associated collection of settings that define certain aspects of the DAL-specific behavior for that class (examples include limiting cache size and using of dynamic indices when querying in-memory cache).

Setting Field Name	Setting Description
`bool CacheAndPerformance. _useDynamicIndices`	If set to `true`, the data abstraction layer attempts to create indices on-the-fly to increase speed when querying cached, in-memory results. For example, issuing several queries that search for `StateName == "CO"CO"` will result in an index being created that stores all records having a state name of "CO". When a query for "all people with last name Jones in Colorado" is issued, the dynamic index will be used to instantly retrieve all records with `StateName == "CO"`.
`<small>bool CacheAndPerformance. _track ConcurrentDBQueries</small>`	If set to `true`, the data abstraction layer keeps track of the number of concurrent, outstanding database queries. When enabled, the number of concurrent, outstanding database queries can be read from `CacheAndPerformance._currentSimultaneousDBQueriesies`.
`<small>bool CacheAndPerformance. _dbWrites OnSingleThread</small>`	Default is `false`. If set to `true`, the data abstraction layer will only allow a single database write operation to occur at any given time. I've used this flag in the past to debug a concurrency-related issue related to inserting multiple records while simultaneously searching in-memory cache.
`<small>int CacheAndPerformance. _dbHits BeforeFullPopulate</small>`	Default is 10. This setting only applies to DAL-enabled classes that have been marked as having a growable cache, using the `[Prepopulate(false)]` class attribute. When this number of subsequent queries miss the cache and results must be retrieved from the database, the DAL will read the entire backing table into memory in order to improve query speed and reduce database chatter. For extremely large data tables, this value should be set to `Int.MaxValue` in order to avoid the scenario where the DAL attempts to read the contents of a massive table into memory.
`<small>bool CacheAndPerformance. _useDynamicSort</small>`	Default is `true`. If set to `true`, the data abstraction layer will keep track of the class' most commonly queried properties. If a single property emerges as the most frequently queried, the DAL will sort the in-memory cached result collection by that property. After the cache has been sorted, when a query is issued that uses the sorted property, the DAL will leverage the collection's sort order to quickly locate and retrieve records. The current "best sort property" is stored in `public static string CacheAndPerformance._bestSortPropertyName`.
`<small>bool WhereCache. _disableWhereCache</small>`	Default is `false`. With the Where Cache enabled, query results are cached according to the logic used to retrieve them. The Where Cache is critical in ensuring that only a single database hit occurs when multiple queries are issued using duplicate query logic.
`<small>int WhereCache. _maxQueries InWhereCache</small>`	Default is 1000. If the Where Cache is enabled, this defines the maximum number of unique queries for which cached results will be stored.
`<small>int WhereCache. _maxObjects InWhereCache</small>`	Default is 400,000. This number limits the total number of object instances stored in the Where Cache. The current number of object instances in the Where Cache can be read from `int WhereCache._currentObjectCountInWhereCache`.
`<small>bool Cache. Suspend CacheInvalidation</small>`	While not a setting per se, this property can be used to temporarily turn off cache invalidation (and subsequent cache pre-population). This is especially useful when performing multiple inserts or updates against a pre-populating, cached DAL-enabled class. If cache invalidation is not suspended during execution of multiple update/insert/delete statements that cause intermittent cache invalidation, DAL may end up reading the entire table contents into memory several times; without suspending cache invalidation, as soon as DAL repopulates the pre-populated cache, the cache may be marked as invalid, causing the data to be re-fetched.

Advanced Features: Join and JoinMultiple

Note the JoinsTo(typeof(Person)) attribute attached to the person_id property of the Address class to the right. Similar to specifying a foreign key in the database, by attaching this attribute to the person_id property, you are informing the DAL that the value contained in the property Address.person_id references a specific person with the given identifier.

Once you have added the JoinsTo attribute, you can retrieve the entity pointed to by the foreign key using either of the following (note: using JoinNonNull ensures that a null value is not returned; if the referenced entity does not exist, an empty entity will be created and returned):

Person addressBelongsTo = address.Join<Person>();
//... or ...
Person addressBelongsTo = address.JoinNonNull<Person>();

The OneToMany attribute is used when there is a one-to-many relation between classes. In the example below, by attaching the [OneToMany(typeof(Address), "person_id")] attribute to the Person.person_id property, the DAL is informed that the value of the Address.person_id property on the Address table references the Person.person_id property on the Person table (and that there may be multiple addresses associated with a single person):

public partial class Person : 
       DBLayer<Person>, IDBLayer<Person>, INotifyPropertyChanged
{
...
    [MappedToColumn, Identity, PrimaryKey, QuickLookup]
    [OneToMany(typeof(Address), "person_id")]
    public long person_id
    {
        get { return _person_id; }
        set { Set<long>(ref _person_id, "person_id", value); }
    }
...
}

After marking the property with the OneToMany attribute, all addresses associated with a person can be retrieved using a single method call:

List<Address> addressesForPerson = person.JoinMultiple<Address>();

Other Useful Methods / Additional Functionality

The DBLayer<T> base class defines a virtual method InitializeStaticVariables(), which can be overridden in your DBLayer<T>-derived class. This method is called once when the DAL assigns a connection string for use by your class, and overriding this method may be useful for executing one-time initialization routines (such as initializing class-specific DAL-related settings) or performing runtime checks:

There are also several somewhat-tangent features included in the solution that are worth mentioning. These are:

DAL-enabled class collections support multi-property sorting. For details on how this is implemented, see Generic Multi-Field/Property Sorting for Lists of Business Objects. The example below sorts a list of Log entries by date (descending) and then by severity (also descending):

OrderBy<Log> orderBy = 
    new OrderBy<Log>("LogDate", true).AddOrderBy("Severity", true);

logEntries = logEntries.SortByProperties(orderBy);

The solution contains static methods used to remove duplicates in a strongly-typed list of entities. If a DAL-enabled class is identifiable by a unique key (an ID-related property has been marked with the Identity attribute), duplicates can be removed by calling myList.RemoveDuplicates(). If a property name is passed in as a parameter, RemoveDuplicates uses the value of specified property to determine if two items are duplicates:

List<Person> peopleList1 = ...
List<Person> peopleWithNoDupes = peopleList1.RemoveDuplicates();

List<Person> peopleList2 = ...
List<Person> personWithUniqueFullNames = 
     peopleList2.RemoveDuplicates("FullName");

List<Person> uniquePeople = 
     peopleList1.CombineAndRemoveDuplicates(peopleList2);

Appendix: DAL-Related Class Attributes

`DataTransportType(DataTransport transportType)`	Specifies the transport type used to communicate with the database. Specifically, when using parameterized queries, `DataTransport.SqlDirectConnection` uses named parameters whereas `DataTransport.SQLOLEDB` uses ordered parameters with ? placeholders.
`TableName(string tableName)`	Specifies the database table name associated with the class. If no table name is specified, the table name will be assumed to be the same as the class name.
`ConnectionString(string connectionString)`	Used to specify a hardcoded connection string.
`ReadConnectionString FromConnectionStringKey(string connectionStringKey)`	Reads the `<connectionStrings>` section of the executing assembly's app.config or web.config file, looking for the specified key name. When found, the `connectionString`'s value will be used as the connection string.
`ReadConnectionString FromAppSettings(string appKeyName)`	Reads the `<appSettings>` section of the executing assembly's app.config file, looking for the specified key name. When found, the app setting's key value will be used as the connection string.
`ReadConnectionStringFromMethod(string methodName)`	Specifies that the connection string should be retrieved at runtime by calling a static method of the class with the specified name.
`UsesMangling`	If attached to a class with a value of `true`, this attribute allows for marking individual string properties for simple data mangling.
`CollectionCacheable`	Marks a class as cacheable. Future support for time-related cache expiration.
`DontSearchAgainAfter ItemNotFound`	This class attribute is targeted at large tables that are infrequently updated (such as a help system). Classes marked with this property ensure that if a certain record is not found in the cache or the database, then subsequent queries for the same record will return no results without re-querying the database (until cache invalidation). This attribute must be accompanied by the `CollectionCacheable` attribute in order to have any practical use.
`PrepopulateCache`	Marking a class with `PrepopulateCache(false)` creates a growable cache - rows retrieved via queries during program execution will be cached. Using `PrepopulateCache(true)`, or not specifying this attribute, will cause the database table to be read into memory. All subsequent queries (unless the cache becomes dirty) will be performed in-memory.
`IDALDatabaseClassName(string className)`	The name of the class that will be instantiated at runtime in order to perform queries and execute commands against a specific database type. The class must implement the `IDALDatabase` interface.
`ICacheProviderClassName(string className)`	The name of the class that will be instantiated at runtime in order to handle data caching (must implement `ICacheProvider`). If not specified, defaults to the built-in `DBLayerCache` class.
`IDALWhereCacheClassName(string className)`	The name of the class that will be instantiated at runtime in order to store and retrieve results associated with specific query logic (must implement `IDALWhereCache`). If not specified, defaults to the built-in `DALWhereCache` class.
`IDALEncryptProvider ForEncryption(string className)`	The name of the class that will be instantiated at runtime to provide symmetric key encryption support when properties are marked with the `Rijindael` attribute (To do: rename the `Rijindael` attribute). If not specified, defaults to the built-in `EncryptionAndDataMangling` class.
`IDALEncryptProvider ForMangling(string className)`	The name of the class that will be instantiated at runtime to provide simple string mangling functionality to properties that are marked with the `MangleTextInDB` attribute. If not specified, defaults to the built-in `BasicStringMangler` class.

DAL-Related Property Attributes

`MappedToColumn(string columnName)`	Specifies the database column name in the table associated with this property. If no column name is specified, the column name is assumed to share the same name as the property.
`LimitTextLength(int maxLength)`	Attach this attribute to any string property that is not unlimited length in the database. For instance, if field `MENCID` is a `VARCHAR(10)` in the database, you should attach the attribute `[LimitTextLength(10)]` to the appropriate class property. If `MENCID` is a `CHAR(10)`, you should attach `LimitTextLength(10)` and `PadText(10)` to the property. This ensures that the property is trimmed or padded as necessary to 10 characters before writing (or searching) in the database.
`PadText(int padLength)`	For `CHAR` or `NCHAR` fields (not of variable length), you must set the padding on the property, specifying how many characters long the database expects the string to be. Property values will then be padded to the appropriate length for inserts/updates/deletes.
`DALIgnoreFor DeleteStatements`	Specifies that the column associated with the property should not be used when constructing `DELETE` statements.
`MangleTextInDB`	If this attribute is set on a string property, the string will be mangled before being inserted into the database, and de-mangled when retrieved. Use with the `UsesMangling` class attribute. `Rijindael` - expensive (high overhead) - decent symmetric-key based encryption when writing/reading the property value to/from the database.
`JoinsTo(Type otherClassType)`	Marks that this property value holds a foreign key (an identity value for an object in a different class), pointing (in a 1-to-1 manner) to a unique object of the specified type.
`OneToMany(Type joinsToType, string foreignKeyPropertyName)`	Specifies that this property serves as a foreign key for a different DAL-derived class type. Also requires the property name on the other class type that references the `OneToMany`-decorated property.
`ExtraColumnName(string columnName)`	If a result column from a SQL query matches the text specified by this attribute, the property will automatically be filled in by the value of that result column. Example: C# [ExtraColumnName("ProductDesc")] public string Description {set {...} get {...} } Any SQL query that returns a column named "`ProductDesc`" will auto-set the `Description` property based on the values returned.
`Identity`	Attach this attribute to the property corresponding to an identity/auto-increment column. When inserting an entity, the assigned identity/auto-increment value will be retrieved and will automatically populate the property.
`PrimaryKeyAttribute`	Marks the property as a primary key. Primary keys are used to quickly identify single records for deletion / updates.
`SQLDBTypeAttribute(SqlDbType dbType)`	(Mostly) Optional attribute. `SqlDbType.Text`, Image, or other non-searchable SQL DB typed columns cannot be used in queries. Therefore, marking the property as `SqlDbType.Text`, etc., will ensure that the property value is never used in a query. It may also be necessary to mark boolean properties with `SqlDbType.Bit`.
`QuickLookup`	For classes marked as cacheable, attaching this attribute "indexes" the property to which it is attached. Recommended for identity values of classes that are read from frequently, but written to infrequently. Accessed via `ClassName.Find[key]`. Example: `Person p = Person.Find[12];`.

Disclaimer

Although the DAL presented in this article handles the ASP.NET-database interaction for several websites and is being used in Windows Forms applications, I make no guarantees that there aren't still nasty bugs lurking around every corner. Use at your own risk.

Points of Interest

I hope you've enjoyed the article (and?)/or have found it useful. Comments/suggestions are welcome, and I'd be happy to publish future versions to CodeProject.

owen@binarynorthwest.com. Member BrainTech, LLC.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)