Visit the project homepage at CodePlex.
Introduction
The LINQ project, which will be part of the next version of Visual Studio (codename "Orcas"), is a set of extensions that make it possible to query data sources directly from the C# or VB.NET languages. LINQ extends the .NET Framework with classes that represent queries, and extends both C# and VB.NET with language features that make these queries easy to write. It also includes libraries for querying the most common types of data sources, such as SQL databases, DataSets and XML files. This article requires some basic knowledge of LINQ and C# 3.0, so I recommend looking at the LINQ Overview available from the official project web site before reading it.
LINQ includes extensions for C# and VB.NET, but there are no plans to support LINQ in C++/CLI. The goal of the CLinq project is to make part of LINQ's functionality usable from C++/CLI. Thanks to the very powerful operator overloading mechanism in C++/CLI, it is possible to use LINQ to SQL for accessing SQL databases from C++/CLI, as well as some other LINQ features. I will first demonstrate how the same database query looks in C# 3.0 and in C++/CLI. Then we will look at CLinq in more detail. The following query, written in C# 3.0, uses the Northwind database and returns the contact name and company name of all customers living in London:
NorthwindData db = new NorthwindData(".. connection string ..");
var q =
    from cvar in db.Customers
    where cvar.City == "London"
    select cvar.ContactName + ", " + cvar.CompanyName;
foreach(string s in q)
    Console.WriteLine(s);
Now, let's look at the same query written in C++/CLI using CLinq. It is a bit more complex, but this is the price for implementing it as a library instead of modifying the language:
NorthwindData db(".. connection string ..");
Expr<Customers^> cvar = Var<Customers^>("c");
CQuery<String^>^ q = db.QCustomers
    ->Where(clq::fun(cvar, cvar.City == "London"))
    ->Select(clq::fun(cvar,
        cvar.ContactName + Expr<String^>(", ") + cvar.CompanyName));
for each(String^ s in q->Query)
    Console::WriteLine(s);
LINQ and C++/CLI overview
In this section, I'll briefly recap a few LINQ and C++/CLI features that are important for understanding how CLinq works. If you're familiar with LINQ and C++/CLI, you can safely skip this section.
Important LINQ features
Probably the most important C# extension that makes LINQ possible is the lambda expression. Lambda expressions are similar to anonymous delegates, but the syntax is even simpler. They can be used to declare functions inline, and you can pass them as parameters to methods. There is, however, one important difference from anonymous delegates: a lambda expression can be compiled either as executable code, like an anonymous delegate, or as a data structure that represents the source code of the lambda expression. This structure is called an expression tree. Expression trees can also be compiled at runtime, so you can convert this representation to executable code.
LINQ to SQL takes the expression tree representing a query that contains lambda expressions and converts it to a SQL query, which is sent to SQL Server. LINQ to SQL also includes a tool called sqlmetal.exe, which generates classes that represent the database structure. So, when you're writing queries, you can work with these type-safe objects instead of having to specify database tables or columns by name.
Important C++/CLI features
Now I'd like to mention a few of the rich set of C++/CLI features. LINQ itself is available for .NET, so we'll use the ability to work with managed classes throughout the project. We'll also use the ability to work with both C++ templates and .NET generics. CLinq benefits from the fact that .NET generics can be compiled and exported from an assembly, while C++ templates are interesting thanks to their support for template specialization. This means that if you have a SomeClass<T> template, you can write a special version for specific type parameters -- for example, SomeClass<int> -- and modify the behavior of this class, including adding methods.
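For readers less used to templates, here is a minimal sketch of template specialization in plain standard C++ (not C++/CLI, and not taken from CLinq). The names Wrapper and Length are made up for illustration: the primary template has no type-specific members, while the specialization for std::string adds one.

```cpp
#include <cstddef>
#include <string>

// Primary template: a generic wrapper with no type-specific members.
template <typename T>
struct Wrapper {
    T value;
};

// Full specialization for std::string: it replaces the primary template
// for this one type and adds a Length() member that Wrapper<int> and
// other instantiations do not have.
template <>
struct Wrapper<std::string> {
    std::string value;
    std::size_t Length() const { return value.size(); }
};
```

Calling Length() on a Wrapper<int> is a compile-time error, which is the same property that lets a specialization expose members only for the types where they make sense.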
Basic CLinq features
In the previous example, we used the Expr<Customers^> and Var<Customers^> classes. These two classes are typed wrappers and are declared using C++/CLI templates. We use templates instead of generics because templates allow template specialization. This means that there are basic Expr<> and Var<> classes and that these can be specialized. For example, Expr<Customers^> can contain some additional properties. Using these additional properties, you can express operations on the Customers class. These template specializations can be generated using the clinqgen.exe tool, which will be described later. CLinq also supports a slightly more complex syntax that you can use for manipulating classes that don't have template specializations.
Before we start, I'll explain how the CLinq library is organized. It consists of two parts. The first part is the EeekSoft.CLinq.dll assembly, which contains the core CLinq classes. You'll need to reference this assembly from your project, either using project settings or with the #using statement. The second part is the clinq.h header file and two other headers that contain C++/CLI templates. You'll need to include clinq.h in every CLinq project. Header files are used because CLinq relies on C++/CLI templates. The classes from the core library can be used if you want to share CLinq objects across several .NET projects.
I already mentioned the Expr<> class. This class is written using templates and it is included, together with Var<>, from the clinq.h file. Both are derived from the Expression<> class in the CLinq assembly. There are some other classes in the assembly, but this one is the most important. It is written using .NET generics, so it can be shared between multiple projects. I recommend using this class as the type for the parameters of any public methods in your project that can be called from other .NET assemblies.
Expr and Var classes
Let's look at some sample code. As you can see from the previous paragraph, the Expr<> and Var<> classes are the key structures of the CLinq project, so we'll use them in the following example. The example works with two specialized versions of these classes, one for the int type and the other for the String^ type:
Expr<int> x = Var<int>("x");
Expr<String^> str("Hello world!");
Expr<int> expr = str.IndexOf("w") + x;
Looking at the code, you might think that the IndexOf method and the other operations are executed when the code runs, but this isn't true! This is an important fact to note: the code only builds internal structures that represent the expression; the expression itself is not executed. This gives you behavior similar to C# 3.0 lambda expressions, which can also be used to build a representation of the written expression instead of executable code. You can also convert the expression represented by the Expr<> class to the structures used by LINQ, as demonstrated in the following example:
System::Expressions::Expression^ linqExpr = expr.ToLinq();
Console::WriteLine(linqExpr);
The result printed to the console window will be:
Add("Hello world!".IndexOf("w"), x)
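The idea that an overloaded operator can record an operation instead of performing it may be easier to see in a small plain C++ sketch. This is not CLinq code: Node, IntExpr, variable, constant and to_string are hypothetical names, and only addition is modelled.

```cpp
#include <memory>
#include <string>

// An expression node is either a leaf (variable or constant, stored as
// text) or an Add node with two children.
struct Node {
    std::string text;                   // leaf text, empty for Add nodes
    std::shared_ptr<Node> left, right;  // children, null for leaves
};

// Hypothetical typed wrapper, playing the role that Expr<int> plays in CLinq.
struct IntExpr {
    std::shared_ptr<Node> node;
};

IntExpr leaf(const std::string& text) {
    return IntExpr{std::make_shared<Node>(Node{text, nullptr, nullptr})};
}

IntExpr variable(const std::string& name) { return leaf(name); }
IntExpr constant(int value) { return leaf(std::to_string(value)); }

// operator+ does no arithmetic: it only records an Add node.
IntExpr operator+(const IntExpr& a, const IntExpr& b) {
    return IntExpr{std::make_shared<Node>(Node{"", a.node, b.node})};
}

// Renders the recorded tree in a prefix form.
std::string to_string(const Node& n) {
    if (!n.left) return n.text;
    return "Add(" + to_string(*n.left) + ", " + to_string(*n.right) + ")";
}
```

With these definitions, constant(6) + variable("x") produces a tree rather than a number, and to_string on its root node shows the recorded structure.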
Lambda expressions
Let's now look at the syntax for writing lambda expressions in CLinq. Lambda expressions are represented by the generic Lambda<> class. The type parameter of this class should be one of the Func delegates declared by LINQ in the System::Query namespace. To declare lambda expressions, you can use the fun function in the EeekSoft::CLinq::clq namespace. Assuming that you included the using namespace EeekSoft::CLinq; directive, which is recommended, the source code will look like this:
Expr<int> var = Var<int>("x");
Expr<int> expr = Expr<String^>("Hello world!").IndexOf("w") + var;
Lambda<Func<int, int>^>^ lambda = clq::fun(var, expr);
Console::WriteLine(lambda->ToLinq());
Func<int, int>^ compiled = lambda->Compile();
Console::WriteLine(compiled(100));
After executing this example, you should see the following output in the console window. The first line represents the lambda expression and the second line is the result of lambda expression invocation:
x => Add("Hello world!".IndexOf("w"), x)
106
As in LINQ, you can compile a CLinq expression at runtime; in fact, CLinq uses LINQ internally to do so. In the previous example, this was done using the Compile method. The returned type is one of the Func<> delegates, and this delegate can be invoked directly.
As in LINQ, lambda expressions can have at most 4 parameters. This is due to the limitations of the Func<> delegates declared in the LINQ assemblies. Because of this limitation, the clq::fun function has the same number of overloads. Also note that in most situations you don't have to specify type arguments to this function, because the C++/CLI compiler can infer the types for you. Let's look at one more example, which demonstrates declaring a lambda expression with more than one parameter:
Expr<int> x = Var<int>("x");
Expr<int> y = Var<int>("y");
Lambda<Func<int, int, int>^>^ lambda2 =
    clq::fun(x, y, 2 * (x + y) );
Console::WriteLine(lambda2->Compile()(12, 9));
In this example, the body of the lambda expression isn't declared earlier as a separate variable, but is composed directly in the call to clq::fun. We also used overloaded operators, namely * and +, in the body of the lambda expression. If you run this code, the result will be (12 + 9) * 2, which is 42.
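To make the idea of a lambda that is composed first and invoked later more concrete, the following plain C++ sketch models a two-parameter lambda body as a closure. It is only an analogy: CLinq builds a LINQ expression tree and lets LINQ compile it, whereas here the overloaded operators compose std::function objects directly. Expr2, arg_x, arg_y and lit are hypothetical names.

```cpp
#include <functional>

// Hypothetical stand-in for the body of a two-parameter lambda: the
// expression is stored as a closure mapping the arguments (x, y) to a value.
struct Expr2 {
    std::function<int(int, int)> eval;
};

// Leaves: the first parameter, the second parameter, and a constant.
Expr2 arg_x() { return {[](int x, int) { return x; }}; }
Expr2 arg_y() { return {[](int, int y) { return y; }}; }
Expr2 lit(int v) { return {[v](int, int) { return v; }}; }

// The operators do not compute anything when called; they return a new
// closure that will evaluate both operands once arguments are supplied.
Expr2 operator+(Expr2 a, Expr2 b) {
    return {[a, b](int x, int y) { return a.eval(x, y) + b.eval(x, y); }};
}

Expr2 operator*(Expr2 a, Expr2 b) {
    return {[a, b](int x, int y) { return a.eval(x, y) * b.eval(x, y); }};
}
```

Under these assumptions, lit(2) * (arg_x() + arg_y()) corresponds to the body 2 * (x + y), and calling its eval member with 12 and 9 performs the actual arithmetic.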
Supported types and operators
In the previous example, I used two overloaded operators. These operators are declared in the Expr<int> template specialization. You can use them when working with an expression representing an integer. CLinq includes template specializations with overloaded operators for the following standard types:
Type | Supported operators & methods |
bool | Comparison: != , == ; Logical: && , || , ! |
int | Comparison: != , == , < , > , <= , >= ; Math: + , * , / , - ; Modulo: % ; Shifts: << , >> |
Other integral types | Comparison: != , == , < , > , <= , >= ; Math: + , * , / , - ; Modulo: % |
float , double , Decimal | Comparison: != , == , < , > , <= , >= ; Math: + , * , / , - |
wchar_t | Comparison: != , == |
String^ | Comparison: != , == ; Concatenation: + ; Standard string methods (IndexOf , Substring , etc.) |
For a complete list of supported types with their methods and operators, see the generated documentation (14.9 Kb). The following example demonstrates using overloaded operators with expressions representing double and float. Mixing different types is another interesting problem, which is why we use two different floating point types here:
Expr<float> fv = Var<float>("f");
Expr<double> fc(1.2345678);
Lambda<Func<float, float>^>^ foo4 = clq::fun(fv,
    clq::conv<float>(Expr<Math^>::Sin(fv * 3.14) + fc) );
You can see that we're using another function from the clq namespace, clq::conv. This function is used for converting types when an implicit conversion is not available. In the sample, we're using the Sin function, which accepts Expr<double> as a parameter. The variable of type float is converted to an expression of type double implicitly, but no implicit conversion exists in the opposite direction, so we have to use the clq::conv function to get back to float. CLinq allows implicit conversion only from a smaller floating point type to a larger one -- i.e. float to double -- or from a smaller integral type to a larger one, for example, short to int. This example also uses the Expr<Math^> class, which is another interesting template specialization. This specialization represents the .NET System::Math class and contains most of the methods from this class.
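The widening-only conversion rule can be expressed with ordinary C++ templates. The sketch below is my own illustration in standard C++, not the way CLinq is implemented: Val, conv and twice are hypothetical names, and "smaller" is approximated by comparing sizeof.

```cpp
#include <type_traits>

// Hypothetical typed wrapper, standing in for Expr<T>.
template <typename T>
struct Val {
    T value;

    explicit Val(T v) : value(v) {}

    // Implicit conversion from Val<U> is enabled only when U and T belong
    // to the same family (both floating point or both integral) and U is
    // strictly smaller than T, e.g. float -> double or short -> int.
    template <typename U,
              typename = typename std::enable_if<
                  ((std::is_floating_point<U>::value && std::is_floating_point<T>::value) ||
                   (std::is_integral<U>::value && std::is_integral<T>::value)) &&
                  (sizeof(U) < sizeof(T))>::type>
    Val(const Val<U>& other) : value(other.value) {}
};

// Explicit conversion for every other case, in the spirit of clq::conv.
template <typename To, typename From>
Val<To> conv(const Val<From>& v) {
    return Val<To>(static_cast<To>(v.value));
}

// A function that accepts only the double wrapper, like Sin in the text.
Val<double> twice(Val<double> v) { return Val<double>(v.value * 2.0); }
```

Passing a Val<float> to twice compiles because the widening constructor applies, while assigning the resulting Val<double> to a Val<float> requires an explicit conv<float> call.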
Working with classes
I already demonstrated how you can work with basic data types like int or float, but I said only a little about working with other classes. There are two possible approaches. If a template specialization exists, you can use it; it includes properties and methods that represent the members of the underlying class. These specializations exist for some standard types, like String^, and can be generated for LINQ to SQL database mappings. If a template specialization isn't available, you have to use general-purpose methods that invoke a method or property by its name.
Typed wrappers
Using a class for which a template specialization exists is fairly simple. The following example declares an expression working with the String^ type:
Expr<String^> name = Var<String^>("name");
Expr<String^> sh("Hello Tomas");
Expr<int> n = sh.IndexOf('T');
Lambda<Func<String^, String^>^>^ foo =
    clq::fun(name, sh.Substring(0, n) + name);
Console::WriteLine(foo->ToLinq());
Console::WriteLine(foo->Compile()("world!"));
In this example, we use two methods that are declared in the Expr<String^> class. These methods are IndexOf and Substring; they represent calls to the corresponding methods of the String^ type. If you look at the program output, you can see that it contains calls to these two methods. There is also a call to the Concat method, which was generated by CLinq when we used the + operator for string concatenation:
name => Concat(new [] {"Hello Tomas".
Substring(0, "Hello Tomas".IndexOf(T)), name})
Hello world!
Indirect member access
To demonstrate the second approach, we'll first define a new class with a sample property, an instance method and a static method (static properties can be invoked by name as well):
ref class DemoClass
{
    int _number;
public:
    DemoClass(int n)
    {
        _number = n;
    }
    property int Number
    {
        int get()
        {
            return _number;
        }
    }
    int AddNumber(int n)
    {
        return _number = _number + n;
    }
    static int Square(int number)
    {
        return number * number;
    }
};
Now let's get to the more interesting part of the example. We will first declare a variable of type DemoClass^ and then use the Prop method to read a property by its name. We'll use Invoke to call a member method and InvokeStatic to invoke a static method of this class. The AddNumber method is a bit tricky because it increments the number stored in the object as a side effect, which means that the value of the expression depends on the order in which the parts of the expression are evaluated:
Expr<DemoClass^> var = Var<DemoClass^>("var");
Lambda<Func<DemoClass^,int>^>^ foo = clq::fun(var,
    var.Prop<int>("Number") +
    var.Invoke<int>("AddNumber", Expr<int>(6)) +
    Expr<DemoClass^>::InvokeStatic<int>("Square", Expr<int>(100) ) );
DemoClass^ dcs = gcnew DemoClass(15);
int ret = foo->Compile()(dcs);
Console::WriteLine("{0}\n{1}", foo->ToLinq(), ret);
And the output of this example would be:
var => Add(Add(var.Number, var.AddNumber(6)), Square(100))
10036
I included the output because I wanted to point out one interesting fact: it looks the same whether you use a generated template specialization or invoke members by name. This is because, when you invoke by name, the method or property to be invoked is found using reflection before the LINQ expression tree is generated. This also means that when you execute the compiled lambda expression, it calls the method or property directly and not by its name.
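The difference between looking a member up by name once and looking it up on every call can be sketched in plain C++. Standard C++ has no reflection, so a hand-written table plays that role here; Demo, Getter, resolve and make_accessor are hypothetical names and none of this is CLinq code.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

struct Demo {
    int number;
    int Number() const { return number; }
    int Doubled() const { return number * 2; }
};

using Getter = int (Demo::*)() const;

// Name lookup: the table stands in for reflection. An unknown name is
// reported at run time, not at compile time.
Getter resolve(const std::string& name) {
    static const std::map<std::string, Getter> table = {
        {"Number", &Demo::Number},
        {"Doubled", &Demo::Doubled},
    };
    auto it = table.find(name);
    if (it == table.end()) throw std::invalid_argument("no member named " + name);
    return it->second;
}

// The lookup runs once, when the accessor is built. The returned callable
// holds the member pointer and never uses the name again.
std::function<int(const Demo&)> make_accessor(const std::string& name) {
    Getter g = resolve(name);
    return [g](const Demo& d) { return (d.*g)(); };
}
```

A misspelled name makes make_accessor throw immediately, while a successfully built accessor can be called any number of times without further lookups.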
Calling constructors in projection
So far, we've looked at calling methods and reading property values. There is one more interesting problem that I haven't covered yet. Sometimes you may want to create an instance of a class and return it from a lambda expression. CLinq doesn't support anything like C# 3.0's anonymous types, but you can invoke a class constructor and pass parameters to it using the clq::newobj function. The following sample assumes that you have a class called DemoCtor with a constructor taking a String^ and an int as parameters:
Expr<String^> svar = Var<String^>("s");
Expr<int> nvar = Var<int>("n");
DemoCtor^ d = clq::fun(svar, nvar, clq::newobj<DemoCtor^>(svar, nvar) )
    ->Compile()("Hello world!", 42);
After executing this code, the d variable will contain an instance of the DemoCtor class, created using the constructor described above. You should be very careful when using the newobj function because there is no compile-time checking. If the required constructor doesn't exist or its parameter types are incompatible, the code will fail with a run-time error.
Using LINQ
You're now familiar with all of the CLinq features that you need to start working with data using LINQ in C++/CLI! The key to working with data is the CQuery class. It serves as a CLinq wrapper for the IQueryable interface, which represents a query in LINQ. This class has several methods for constructing queries, including Where, Select, Average and others. You can construct an instance of this class if you already have an object implementing the IQueryable interface, but for working with a database, you can use a tool that generates code to make this simpler. The CQuery class also has a property called Query that returns the underlying IQueryable interface. We'll need this property later for accessing the results of the query.
Working with SQL Database
LINQ to SQL: Introduction
We will use two tools to generate a CLinq header file with classes that represent the database structure. The first tool is shipped as part of LINQ and is called sqlmetal. This tool can generate C# or VB.NET code, but it can also generate an XML description of the database structure. We will use the third option. The following example demonstrates how to generate an XML description, northwind.xml, for the Northwind database on a SQL Server instance running at localhost:
sqlmetal /server:localhost /database:northwind /xml:northwind.xml
Once we have the XML file, we can use the clinqgen tool that is part of CLinq. This tool generates a C++/CLI header file with classes that represent the database tables, the corresponding Expr<> template specializations, and a class that represents the entire database. You can customize the name and namespace of this class. If you want to automate this task, you can include the XML file generated by sqlmetal in your project and set its custom build tool to the following command. Hacker note: You can also use a pipe (|) to get these two tools working together.
clinqgen /namespace:EeekSoft.CLinq.Demo
    /class:NorthwindData /out:Northwind.h $(InputPath)
Now you'll need to include the generated header file, and then we can start working with the database. We'll first create an instance of the generated NorthwindData class, which represents the database. Note that the example uses C++/CLI stack semantics, but you can use gcnew instead if you prefer. Once we have an instance of this class, we can use its properties that represent data tables. The properties with the Q prefix return the CQuery class, so we'll use these instead of the properties without the prefix, which are designed for use from C# 3.0 or VB.NET. The following example demonstrates some basic CQuery methods:
NorthwindData db(".. connection string ..");
Console::WriteLine("Number of employees: {0}",
    db.QEmployees->Count());
Expr<Products^> p = Var<Products^>("p");
Nullable<Decimal> avgPrice =
    db.QProducts->Average( clq::fun(p, p.UnitPrice) );
Console::WriteLine("Average unit price: {0}", avgPrice);
Expr<Employees^> e = Var<Employees^>("e");
Employees^ boss = db.QEmployees->
    Where( clq::fun(e, e.ReportsTo == nullptr) )->First();
Console::WriteLine("The boss: {0} {1}",
    boss->FirstName, boss->LastName);
In the first example, we simply called the Count method, which returns the number of rows in the table. In the second example, we used the Average method, which requires one argument: a lambda expression that returns a numeric value for every row in the table. Since the UnitPrice column can contain NULL values, we're working with the Nullable<Decimal> type. It can contain either a real value or NULL, which is represented using nullptr in C++/CLI. The third example used the Where method to keep only the rows matching the specified predicate, i.e. the lambda expression. The result of this call is also a CQuery object, so we can easily chain multiple operations. In this example, we append a call to the First method, which returns the first row of the result set.
LINQ to SQL: Filtering & projection
Let's look at a more interesting sample that covers filtering -- i.e. the Where method -- and projection, i.e. the Select method. The result of the query will be a collection containing instances of a custom class called CustomerInfo. So, let's first look at this class:
ref class CustomerInfo
{
    String^ _id;
    String^ _name;
public:
    CustomerInfo([PropMap("ID")] String^ id,
        [PropMap("Name")] String^ name)
    {
        _id=id; _name=name;
    }
    CustomerInfo() { }
    property String^ ID
    {
        String^ get() { return _id; }
        void set(String^ value) { _id = value; }
    }
    property String^ Name
    {
        String^ get() { return _name; }
        void set(String^ value) { _name = value; }
    }
};
The class has two properties -- ID and Name -- one parameterless constructor and one constructor that needs further explanation. This constructor takes two parameters, which are used to initialize the two fields of the class. There is also an attribute called PropMap attached to each parameter, which describes how the constructor initializes the properties of the class. For example, the attribute [PropMap("ID")] attached to the id parameter means that the value of the ID property will be set to the value of the id parameter in the constructor.
Why is this information important? It will not be used in the following query, but you could write a query that constructs a collection of CustomerInfo objects and later filters this collection using the Where method. The whole query will be passed to LINQ for conversion into SQL. If you use the ID property for the filtering, LINQ needs to know what value was assigned to this property earlier. For this reason, CLinq has the PropMap attribute, which maps property values to the parameters passed to the constructor. In C# 3.0, the behavior is a bit different because you can use anonymous types and you don't need to pass values directly to a constructor.
NorthwindData db(".. connection string ..");
Expr<Customers^> cvar = Var<Customers^>("c");
CQuery<CustomerInfo^>^ q = db.QCustomers
    ->Where(clq::fun(cvar, cvar.Country.IndexOf("U") == 0))
    ->Select(clq::fun(cvar, clq::newobj<CustomerInfo^>(
        cvar.CustomerID, cvar.ContactName +
        Expr<String^>(" from ") + cvar.Country)));
Console::WriteLine("\nQuery:\n{0}\n\nResults:",
    q->Query->ToString());
for each(CustomerInfo^ c in q->Query)
    Console::WriteLine(" * {0}, {1}", c->ID, c->Name);
This code is quite similar to the code that you would usually write when working with LINQ in C# 3.0. In this sample, we first create the database context and declare a variable that will be used in the query. The query itself starts from the QCustomers property, which represents the Customers table in the database. Using the Where method, it then keeps only customers from countries starting with the letter "U". Finally, it performs a projection, i.e. the Select method, which selects only the information that we're interested in and creates a CustomerInfo object.
The sample also prints the SQL command that will be generated from the query. LINQ returns the SQL command if you call the ToString method on the IQueryable representing the query. As I mentioned earlier, the underlying IQueryable of the CQuery class can be accessed using the Query property. So, the code q->Query->ToString() returns the SQL command. The last thing that the code does is execute the query and print information about all returned customers. The query is executed automatically when you start enumerating the collection, which happens in the for each statement.
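Deferred execution, where building a query does no work and enumerating it does, can be illustrated with a small in-memory class in plain C++. This is only a sketch of the idea under my own assumptions: the Query class and its run method are hypothetical, it filters a std::vector instead of generating SQL, and it is not how CQuery is implemented.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical in-memory counterpart of a query object: it stores the
// source rows and a combined filter, and scans nothing until run() is called.
template <typename T>
class Query {
public:
    explicit Query(std::vector<T> rows)
        : rows_(std::move(rows)), filter_([](const T&) { return true; }) {}

    // Returns a new query with an additional predicate. The original query
    // is left unchanged and no rows are examined here.
    Query Where(std::function<bool(const T&)> pred) const {
        Query next(*this);
        std::function<bool(const T&)> prev = filter_;
        next.filter_ = [prev, pred](const T& row) { return prev(row) && pred(row); };
        return next;
    }

    // Executes the query: this is the only place where the rows are scanned.
    std::vector<T> run() const {
        std::vector<T> out;
        for (const T& row : rows_)
            if (filter_(row)) out.push_back(row);
        return out;
    }

    // Executes the query and returns the number of matching rows.
    std::size_t Count() const { return run().size(); }

private:
    std::vector<T> rows_;
    std::function<bool(const T&)> filter_;
};
```

Because Where returns another Query, calls can be chained in the same style as the CQuery examples, and the cost of scanning is paid only when run or Count is called.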
LINQ to SQL: Joins & tuples
For the last example, I wrote a much more complex query. It first performs the GroupJoin operation on customers and orders, which means that it returns a collection of tuples, each containing a customer and all of her orders. After this join, it performs Where filtering and returns only the customers who have at least one order that will be shipped to the USA. The customers are still kept together with their orders. The last operation in the query is a projection that generates a string with the name of the company and the number of orders associated with it.
This query also demonstrates a few more interesting things that we didn't need earlier. The example starts with two typedefs to make the code more readable. The first just defines a shortcut for the collection of orders. The second uses the Tuple class, which is a part of CLinq that I haven't talked about yet. Tuple is a very simple generic class with two type parameters and two properties -- called First and Second -- whose types are determined by the type parameters. You can use this class if you want to return two different values from a projection or join without declaring your own class.
The query returns the Tuple type from the join and later uses the Where operation to filter the customers. This reveals one advantage of using the predefined Tuple class. The co variable, whose type is the expression representing the tuple, Expr<Tuple<>^>, is passed as a parameter to the lambda expression. In the lambda expression, we can directly use its properties First and Second. Because we're manipulating expressions, we're not working with the Tuple class directly. Rather, we're working with a template specialization of the Expr class, in which Expr<Tuple<>^> is extended to contain these two properties. I'll comment on the other interesting features used in this example later, so let's look at the query now:
typedef IEnumerable<Orders^> OrdersCollection;
typedef Tuple<Customers^, OrdersCollection^> CustomerOrders;
NorthwindData db(".. connection string ..");
Expr<Customers^> c = Var<Customers^>("c");
Expr<Orders^> o = Var<Orders^>("o");
Expr<OrdersCollection^> orders
    = Var<OrdersCollection^>("orders");
Expr<CustomerOrders^> co = Var<CustomerOrders^>("co");
CQuery<String^>^ q = db.QCustomers
    ->GroupJoin(db.QOrders,
        clq::fun(c, c.CustomerID),
        clq::fun(o, o.CustomerID),
        clq::fun<Customers^, OrdersCollection^, CustomerOrders^>
            ( c, orders, clq::newobj<CustomerOrders^>(c, orders) ))
    ->Where( clq::fun(co, co.Second.Where(
        clq::fun(o, o.ShipCountry == "USA" )).Count() > 0) )
    ->Select( clq::fun(co,
        co.First.CompanyName + Expr<String^>(", #orders = ") +
        Expr<Convert^>::ToString(co.Second.Count()) ) );
Let's focus on the Where clause. The lambda expression accepts an expression of type Tuple, which I explained earlier, as a parameter and accesses its second value, co.Second. The type of this value is an expression representing a collection, Expr<IEnumerable<>^>. This is another specialization of the Expr<> class and, using IntelliSense, you can discover that this class has a lot of methods for working with collections! These methods correspond to the methods available in the CQuery class, but are designed for working with expressions representing queries instead of with queries directly. In this example, we use the Where method, which again returns an expression representing a query, and also the Count method.
The second class that wasn't mentioned earlier is Expr<Convert^>, which is just another template specialization similar to Expr<Math^>. It contains several methods for type conversions. In this example, we use the ToString method to convert the number of orders to a string.
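If the semantics of the group join are unfamiliar, the following in-memory sketch in plain C++ performs the same three steps on vectors: pair each customer with her orders, keep the customers with at least one order shipped to a given country, and project a string. The Customer and Order structs and the function names are hypothetical simplifications; the real query is translated to SQL by LINQ rather than executed like this.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

struct Customer { std::string id; std::string company; };
struct Order { std::string customerId; std::string shipCountry; };

using CustomerOrders = std::pair<Customer, std::vector<Order>>;

// Group join: every customer appears exactly once, paired with all of
// that customer's orders (possibly an empty list).
std::vector<CustomerOrders> group_join(const std::vector<Customer>& customers,
                                       const std::vector<Order>& orders) {
    std::vector<CustomerOrders> result;
    for (const Customer& c : customers) {
        std::vector<Order> matching;
        for (const Order& o : orders)
            if (o.customerId == c.id) matching.push_back(o);
        result.emplace_back(c, matching);
    }
    return result;
}

// Filter and projection: keep the pairs with at least one order shipped
// to the given country, then build one line of text per remaining customer.
std::vector<std::string> customers_shipping_to(const std::vector<Customer>& customers,
                                               const std::vector<Order>& orders,
                                               const std::string& country) {
    std::vector<std::string> lines;
    for (const CustomerOrders& co : group_join(customers, orders)) {
        bool any = std::any_of(co.second.begin(), co.second.end(),
                               [&country](const Order& o) { return o.shipCountry == country; });
        if (any)
            lines.push_back(co.first.company + ", #orders = " +
                            std::to_string(co.second.size()));
    }
    return lines;
}
```

Note that the count in the projected string covers all of the customer's orders, not only those shipped to the chosen country, which matches the use of co.Second.Count() in the Select step of the query.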
Project summary
Currently, the project is in a very early phase. This means it needs more testing and review from other people. If you find any bugs or if you think that CLinq is missing some important LINQ functionality, let me know. The project currently uses the May 2006 CTP version of LINQ, but it will be updated to support Visual Studio "Orcas" once more stable beta versions become available. The project is available at CodePlex, so you can download the latest version of the source code and binaries from the project site. Because I'm not a C++/CLI expert, I'm very interested in your comments and suggestions. Also, if you're willing to participate in the project, let me know!
Version & updates
- (2nd March 2007) First version, using LINQ May 2006 CTP
- (27th July, 2007) Article edited and moved to the main CodeProject.com article base