Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Extending C# to Support SQL Syntax at Compile Time

0.00/5 (No votes)
23 Jul 2014 1  
Ever wished you could truly embed SQL functionality in your C# code without using strings or late binding? Imagine being able to write complex Where clauses purely in C#.

Context

I regularly receive comments asking why I would create this code when LINQ and Entity Framework exist. It is important to note the original date of this article. It was written in mid 2005, well before LINQ was ever announced. So the actual project is no longer needed as it has been superceded by evolution of C# and VB.NET, however the project and article remain as an illustrative example how one can extend the languages beyond what most think is possible. And since the languages have further evolved with many other features since 2005, the possibilities are even greater today.

Introduction

Ever wished you could truly embed SQL functionality in your C# code without using strings or late binding? Imagine being able to write complex Where clauses purely in C#:

xQuery.Where = 
(CustomerTbl.Col.NameFirst == "Chad" | CustomerTbl.Col.NameFirst == "Hadi")
& CustomerTbl.Col.CustomerID > 100 & CustomerTbl.Col.Tag != View.Null;

Look closely. This is a C# code, not SQL. It's resolved and bound at compile time, but evaluated at run time. In this article I shall provide an introduction to this method and the full source code for your use.

C Omega

C Omega is a research programming language from Microsoft. C Omega is used to test out and try new ideas for future versions of C#. C Omega supports many items that are similar to what is provided by Indy.Data. However Indy.Data provides some more items that C Omega does not. And most important of all - Indy.Data is here now and usable in production code. C Omega is a research project and at best will be incorporated into a future version of C#.

In fact, if we look at some example C Omega code:

struct {
    SqlString CustomerID; 
    SqlString ContactName;
}* res
= select CustomerID, ContactName from DB.Customers;
foreach( row in res ) {
    Console.WriteLine("{0,-12} {1}", row.CustomerID, row.ContactName);
}

It looks very much like Indy.Data:

[Select("select `CustomerID`, `ContactName` from `Customers`")]
public class CustomerRow : View.Row {
public DbInt32 CustomerID;
public DbString ContactName;
}
using (Query xCustomers = new Query(_DB, typeof(CustomerRow))) {
    foreach (CustomerRow xCustomer in xCustomers) {
        Console.WriteLine("{0,-12} {1}", row.CustomerID, row.ContactName);
    }
}

Indy.Data does not quite have the syntax of C Omega, but with C Omega Microsoft has full flexibility to change the language. With Indy.Data we are restricted to how C# can be extended. Fortunately C# can be extended farther than most realize.

Future articles

The main assembly is called Indy.Data and relies on System.Data (ADO.NET). I have notes and plans for many articles around this library including not only the technical aspects, but the usage as well as the methodology. However in an effort to release the "idea" first I am writing this basic introduction article. I realize that there are many things that are not covered in this article, but don't worry they will appear in future articles.

Neither meat nor fish

Indy.Data is neither an O/R mapper nor a code generator in strict terms. Instead Indy.Data is something different, and something similar. In this article this will not be completely apparent, so for this article consider Indy.Data to be a Data Access Library (DAL) only.

Indy.Data does support code generation to keep the DAL objects in sync with your database, however it does not create mappings to business logic, nor does it generate its own SQL during generation.

Indy.Data is not an O/R mapper either. Indy.Data is more flexible in that it is not restricted to parameterized queries or stored procedures. Indy.Data also does not provide query construction and mapping, but instead relies on existing objects in the database, or provided SQL.

Indy.Data provides object wrappers on a 1:1 basis for database tables, views, stored procedures, or SQL statements. These objects can be regenerated at any time to be kept in sync with database changes. In future articles I will explain how Indy.Data can be used to perform the same functions as a code gen or O/R mapper, but in a slightly different way. However for the scope of this article assume Indy.Data to be an advanced implementation of an ADO.NET command object. Indy.Data is suitable to use in all those places where you would use an ADO.NET command object.

Extreme databases

ADO.NET is a good library for connecting to databases. But using ADO.NET still uses the standard methodology that data connectivity is not type safe, and that the binding to the database is loose. Fields are bound using string literals, or numeric indexes. All types are typecast to the desired types. Changes to the database will introduce bugs in the application. These bugs however will not be found until run time because of the loose binding. Unless every execution point and logic combination can be executed in a test, bugs will not appear until a customer finds them.

Because of this, as developers we have been conditioned to never ever change a database. This causes databases to be inefficient, contain old data, contain duplicate data, and contain many hacks to add new functionality. In fact this is an incorrect approach, but we've all grown accustomed to accepting this as a fact of development.

However if we use a tight bound approach as we do with our other code, we can upgrade and update our database to grow with our system. Simply change the database, and recompile. Your system will find all newly created conflicts at compile time and the functionality can then be easily altered to meet the new demand. I call this an "Extreme Database" or XDB, inline with Extreme Programming or XP.

Using the built in ADO.NET commands reading from a query is as follows:

IDbCommand xCmd = _DB.DbConnection.CreateCommand();
xCmd.CommandText = "select \"CustomerID\", \"NameLast\", \"CountryName\""
    + " from \"Customer\" C"
    + " join \"Country\" Y on Y.\"CountryID\" = C.\"CountryID\""
    + " where \"NameLast\" = @PNameLast1 or \"NameLast\" = @PNameLast2";
xCmd.Connection = _DB.DbConnection;
xCmd.Transaction = xTx.DbTransaction;

IDbDataParameter xParam1 = xCmd.CreateParameter();
xParam1.ParameterName = "@NameLast1";
xParam1.Value = "Hower";
xCmd.Parameters.Add(xParam1);
IDbDataParameter xParam2 = xCmd.CreateParameter();
xParam2.ParameterName = "@PNameLast2";
xParam2.Value = "Hauer";
xCmd.Parameters.Add(xParam2);
using (IDataReader xReader = xCmd.ExecuteReader()) {
    while (xReader.Read()) {
        Console.WriteLine(xReader["CustomerID"] + ": " + xReader["CountryName"]);
    }
}

The same code when written using Indy.Data is as follows:

using (CustomerQry xCustomers = new CustomerQry(_DB)) {
    xCustomers.Where = CustomerQry.Col.NameLast == "Hower"
                    | CustomerQry.Col.NameLast == "Hauer";
    foreach (CustomerQry.Row xCustomer in xCustomers) {
        Console.WriteLine(xCustomer.CustomerID + ": " + xCustomer.CountryName);
    }
}

First lets ignore the fact that it is shorter. The standard ADO.NET could be wrapped in an object or method as well. What I want to point out is the fact that the standard ADO.NET requires the use of text strings. In fact, the code listed above has a mistake in it. "@NameLast1" should be "@PNameLast1". The error in the standard ADO.NET code will not be detected until the code is executed, and tracing it down will be more difficult. While any mistake in the Indy.Data code will be caught immediately by the compiler, and the exact location pinpointed. This allows for databases to be easily evolved during development, and greatly reduces bugs as mistakes are found and pinpointed during compile.

Could this be done with a codegen or O/R mapper? Yes, but generally they offer one of the following:

  1. Method calls that create an awkward syntax not representative of the Where clause, especially when complex ones are encountered.
  2. Strings arguments, which leaves us again with a late bound interface causing bugs to be found at runtime.
  3. Database parameters - Parameters can give type safety and early binding if interpreted properly which many tools not only do, but rely on this behaviour. The problem is that parameters severely limit the flexibility of the Where clause and often cause many versions of a single object to be created with many variations of parameters.

With Indy.Data's approach, full freedom is retained to form the Where clause as needed and still retain all the benefits of type safety, early binding, and clean syntax.

One language

Traditionally developers building database systems had to learn not only a development language (C#), but also had to become quite versed in SQL and stored procedure language (such as T-SQL). To develop complex systems, a developer must not only be competent in all the three, but must split the system logic between the three languages. Even if you have a DA who writes all your stored procedures and SQL, it still splits your system logic into multiple languages.

With Indy.Data, SQL is still used, however the SQL that is used is simplified into a basic form and isolated into discrete pieces (Queries and views). All of the system logic can then be coded in C#. Since C# code can now be used to modify the basic SQL building blocks. In short, now you can develop your system using one language, C#. This is done by extending the C# language to handle Where clauses and other permutations of the SQL language.

Views

Indy.Data's primary object is the View.

View Types

Tables/Views

Views can exist for tables or views in a database.

Stored procedures

Support for stored procedures which return result sets is not implemented yet. Support for these types of stored procedures will be coming soon.

Queries

Custom SQL can be created and stored in a database as a view or a stored procedure. However during development the management of such views and especially the alteration of views can be quite difficult. For this case, Indy.Data also allows the developer to specify local SQL statements that are external to the database which are then embedded into a View. The SQL however remains separate and isolated from your code.

Generation

View classes are generated by an external utility. The utility is configured to examine a database and scan the database for tables, views, stored procedures, and external SQL statements. From these, it generates view base classes. From these base classes, it also generates a default shell for user extension.

Since the generator separates the classes into two, the base classes can be regenerated without overwriting the custom code. The generator is currently being ported and updated to support both Firebird and SQL Server. I expect to release this within a few days.

Examples

In this first article all of the examples are on a simple database. I will be expanding on these examples in later articles. These examples show the operations on a single table. Indy.Data can accept views as well as SQL statements.

Many examples have been taken from the NUnit test project. Because of this you will see a lot of assertions and other checks in the example code.

Database

For these examples, the following database is used.

CREATE TABLE "Customer" (
"CustomerID" INTEGER NOT NULL,
"NameFirst" VARCHAR(40) NOT NULL,
"NameLast" VARCHAR(40) NOT NULL,
"Tag" INTEGER
);
CREATE TABLE "Country" (
"CountryID" "KeyDmn" NOT NULL,
"CountryName" VARCHAR(50) NOT NULL,
"ISOCode" CHAR(2) CHARACTER SET ASCII NOT NULL,
CONSTRAINT "PK_Country" PRIMARY KEY ("CountryID"),
CONSTRAINT "UNQ_Country_1" UNIQUE ("ISOCode"),
CONSTRAINT "UNQ_Country_2" UNIQUE ("CountryName")
);

Code samples

Reading from a View

The basic form to read from a view is as follows:

using (CustomerTbl xCustomers = new CustomerTbl(_DB)) {
    xCustomers.SelectAll();
    foreach (CustomerTbl.Row xCustomer in xCustomers) {
        Console.WriteLine(xCustomer.NameFirst + " " + xCustomer.NameLast);
    }
}

Once the view is selected, a foreach can be used to iterate through the rows in the result set. The rows are not loaded into the memory, but are fetched one by one on demand.

The SelectAll is a safety catch that is built into Indy.Data. Before reading from a view, something must be selected. This rule was added in the SelectAll method because many bugs were occurring in our systems where the developers forgot to select data and were performing full result set scans. Because of this, Indy.Data throws an exception if a view is read from what has not been selected. In this case we wish to scan the whole view, so SelectAll is called.

Manual Read

using (CustomerTbl xCustomers = new CustomerTbl(_DB)) {
    xCustomers.SelectAll();
    CustomerTbl.Row xCustomer1 = (CustomerTblRow)xCustomers.Read();
    CustomerTbl.Row xCustomer2 = (CustomerTblRow)xCustomers.Read();
    Console.WriteLine(xCustomer1.NameFirst + " " + xCustomer2.NameFirst);
}

Instead of foreach, the manual Read method is used. Read returns null when the end of the result set is reached and can also be used with a while or other loop mechanism.

Since the row class is separate from the reader, rows can be saved and cached. For example you may wish to compare the current row to the last row. With Indy.Data this is easy, you just need to store a reference to the previous row.

Accessing columns

In the previous examples a column can be easily accessed:

string s = xCustomer1.NameFirst

NameFirst is a string to match the column. Other columns may be int, decimal or any other. But there are more properties to each column. Let's now look at the Tag column.

using (CustomerTbl xCustomers = new CustomerTbl(_DB)) {
    xCustomers.SelectAll();
    foreach (CustomerTbl.Row xCustomer in xCustomers) {
        if (!xCustomer.Tag.IsNull) {
            Console.WriteLine(xCustomer.Tag);
        } 
        else {
            Console.WriteLine(xCustomer.NameFirst + " " + xCustomer.NameLast);
        }
    }
}

Notice that xCustomer.Tag returns an int, but xCustomer.Tag.IsNull returns a bool. Those of you who have worked with ADO.NET DataSets will welcome the clean isolation of rows and columns that Indy.Data provides.

Null

"Null is a state, not a value".

Fortunately all major databases follow this properly. However nearly every data access layer treats accessing a value that is null as an error. This forces the developer to wrap all such accesses with an if statement and leads to many bugs. Null definitely is a state, however Indy.Data returns a default value when null exists. You can however explicitly detect the value of null. For example if Tag is null:

Tag == 0
Tag.IsNull == true

So you can read the value of Tag without an exception being thrown, yet detect null as a state. Numerics return 0, while a string returns "". I have found that in my code this has greatly reduced bugs, yet does not reduce the usefulness of null or violate the fact that null is a state. This is a purposeful design feature in Indy.Data, and not a unintended side effect or compromise.

Inserting a row

using (CustomerTbl xCustomers = new CustomerTbl(_DB)) {
    CustomerTbl.Row xCustomer = new CustomerTbl.Row();
    xCustomer.CustomerID = xCustomers.NewKey();
    xCustomer.NameFirst = "First";
    xCustomer.NameLast = "Last";
    xCustomers.Insert(xCustomer);
}

Updating a row

using (CustomerTbl xCustomers = new CustomerTbl(_DB)) {
    xCustomers.SelectFirstName("Chad");
    CustomerTbl.Row xCustomer = xCustomers.Read();
    // SQL Server can only have one data command per 
    // connection so we must close before update
    xCustomers.Close(); 
    xCustomer.NameLast = "Hauer";
    xCustomers.Update(xCustomer);
}

Updating a row is similar to inserting. However to update a row, we must first obtain the row somehow from the database. In this update, the first row with the first name of Chad is selected. Its last name is then changed. During the update, only the Last Name is updated. Indy.Data detects which columns have been updated and issues efficient SQL to update only those columns.

Filtering data

In updating a Row, the SelectFirstName was called. This is a method that was added in the user code section of CustomerTbl. Indy.Data could allow the developer to specify the Where clause in the code here, however for reasons of methodology (to be explained in a future article) it declares them as protected. This means that all the filter conditions must be isolated into methods and added into the view's user code. The code for SelectFirstName is as follows:

Where = Col.NameFirst == aName;

Note that the Where clause is pure C# code, and not dynamic SQL. This is a very simple example, however it can be made much more complex. Note that even parenthesis groupings are supported.

Where = 
(CustomerTbl.Col.NameFirst == "Chad" | CustomerTbl.Col.NameFirst == "Hadi")
& CustomerTbl.Col.CustomerID > aMinID & CustomerTbl.Col.Tag != View.Null;

Even the LIKE operator has been mapped to %:

Where = Col.NameFirst % aName;

The Where clause can even be constructed dynamically.

Where = CustomerTbl.Col.CustomerID > aMinID;
if (aNullTagsOnly) {
    _Where = _Where & CustomerTbl.Col.Tag == View.Null;
}

Of course, parenthesis and other operators still apply. & or | could have been used in the above code, as well as parenthesis grouping.

Other operations

Indy.Data supports many other operations and capabilities. This article is just an introduction and I will write others in the near future to expand on these.

DataBindings

Using WinForms or WebForms with DataBindings? The library can support this too. It can even be used to fill DataSets for multi-tier applications. I will cover this in another article.

Supported Databases

The library is designed to be database agnostic and work with any SQL based database using ADO.NET. Currently the library has been tested with Firebird and SQL Server. The library should work with Oracle, Interbase and others with minimal additions.

Implementation notes

Roots

The roots of Indy.Data go back to a library I had originally built in Delphi in 1998. Of course because Delphi did not support implicit converters, operators overload, and many other features I used, it was not nearly as smooth as Indy.Data. The first implementation in .NET was created in 2003.

.NET 2.0

The library is currently written for .NET 1.1. In many places it would benefit greatly from the use of generics, partial classes, and ADO.NET data provider factories. When .NET 2.0 is released, Indy.Data will be updated to take advantage of such functionality.

C#

The library is designed for C#. It may be used for other languages, however languages such as Visual Basic .NET do not support implicit conversions and other C# language features used by the library.

Open Source

Indy.Data is open source. If you are interested in using or contributing to Indy.Data, a yahoo group has been established where you can join.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here