Introduction
LINQ stands for Language-INtegrated Queries and basically makes queries first citizens of the .NET languages such as C# and VB.NET (i.e., makes them available in the IntelliSense of Visual Studio 2008).
We can use LINQ with any data source. We can express efficient query behavior in our programming language of choice, optionally transform / shape data query results into whatever format we want, and then easily manipulate the results. LINQ-enabled languages can provide full type-safety and compile-time checking of query expressions, and development tools can provide full IntelliSense, debugging, and rich refactoring support when writing LINQ code.
Using the Code
Why do we need LINQ?
Some of the basic questions that hit a developer when LINQ is introduced to him are:
- Why do we want a tool like LINQ?
- What makes the previous tools inconvenient?
- Was LINQ created only to make working with programming languages, relational data, and XML at the same time more convenient?
An answer to the questions will definitely enable us to see the importance of the LINQ project and appreciate it. It is a well known fact that a vast majority of applications that are developed access data or talk to a relational database. To program such applications, learning a language such as C# is not enough. We need to learn another language such as SQL, and the APIs that tie it together with C# to form the application.
The LINQ Project addresses the problem of Data! =Objects, in all areas such as Relational Data, XML Data etc. The data in these domains are completely different and were originally not created to work with each other with full compatibility. Also, we now use object oriented languages to access and manipulate data from these domains using objects which are another domain altogether. Thus we see that there is a serious mismatch or incompatibility between the different data stores which now are becoming dependent on the data from other stores.
To substantiate what we have just discussed above, let us consider the scenario of how relational data is incompatible with the object world:
- The data types of the relational databases and the object oriented languages behave differently.
- Databases, for performance reasons make use of normalization which results in an organization of data that is specific to the relational data model. Objects are organized based on the concepts of inheritance, composition and complex reference graphs. Databases do not have the concept of inheritance and so on. Thus, since the very models, that the two worlds are built upon, are different is becomes difficult to bridge the gap between them.
- The programming models are different. In SQL we write the exact query for the data we want. If we want the same result in C# we would have to iterate through our collection and perform some conditional operations and then collect the result. Thus the programming models are also different.
- In OO languages we see that the objects contain data as well as the behavior within themselves as one entity. This is quite different from the relational model where the data is completely separated from the code.
A Practical Example of the problem:
Let us look at a practical example of code and see how the two worlds are very different:
The .NET framework Class Libraries contains an API known as ADO.NET to access relational databases and to represent relational data in memory. This API consists of classes such as SqlConnection, SqlCommand, SqlDataAdapter, SqlDataReader etc. The problem with these classes is that they force the developer to work explicitly with tables, records, and columns, while modern languages such as C# and VB.NET use object-oriented paradigms.
LINQ helps us to increase productivity in data intensive programming by removing the burden of writing a lot of plumbing code written to map the object oriented model to relational databases or XML.
Consider the following piece of code written in .NET to connect to the database:
SqlConnection MyCon = new SqlConnection("Data Source = X; Initial Catalog =Y ; User ID = Z; Password=***");
SqlCommand MyComm = new SqlCommand();
MyComm.CommandText = @"Select Name,CustomerID FROM Customers WHERE City =@City";
MyComm.Parameters.AddWithValue("@City", "Mysore");
MyCon.Open();
MyReader = MyComm.ExecuteReader();
if (MyReader.HasRows)
{
string NameCustomer = MyReader.GetString(0);
}
else
{
MessageBox.Show("Sorry No Data");
}
We can note the following points regarding the code above:
- There are a several steps written for a simple functionality
- The queries written are just normal strings and thus at compile time will not be validated as a query but as a string. The query passes all compilation checks even if it is not syntactically correct
- The parameters are loosely defined. For Example, the column types defined here may not be the same as on the server. There is no way to check this at compile time.
- The provider specific dialects or codes cannot be used in this model if we go for Provider Agnostic Codes using the DBxxx classes.
Here we see the limitations in the existing model. So what is the answer?
Finding a Solution in LINQ
LINQ aims at deeply integrating the capabilities of data querying and manipulation languages into programming languages. LINQ removes the barriers among objects, databases and XML. It enables us to work with each one of them with the same language integrated facilities. For example now we can work with data coming from a relational database and data from XML within the same query. LINQ makes it easy to work with.
Some excellent benefits we get are:
- Queries are checked at compile time. We no longer need to wait till run time for an error in the query to be detected.
- We get IntelliSense with Visual Studio when writing LINQ queries which makes coding faster.
Let’s have a look at some of the design goals of LINQ.
Goal
| Motivation
|
Integrate objects, relational data and XML
| The need for unified query syntax for different languages and data sources and the need for a single processing model.
|
Having the power like SQL and XQuery in C# and VB
| Integrate querying abilities right into the programming model
|
Extensibility model for languages
| Enable the implementation for other programming languages
|
Type Safety
| Compile time checking to avoid problems
|
Debugger Support
| -
|
Two views on LINQ?
LINQ can be thought of as two complementary parts:
- A set of tools that work with data
- A set of programming language extensions
LINQ as a Set of Tools
LINQ definitely changes some aspects of handling and manipulating data with applications and components. There are a set of providers that are available in LINQ for the developer to use. These providers include LINQ to Objects, LINQ to XML, LINQ to DataSet, LINQ to SQL, LINQ to Entities etc. All of these providers are built upon a common foundation consisting of a set of building blocks like Query Operators, Query Expressions and Expression Trees. These blocks allow the LINQ tools to be extensible. Other variants of LINQ can be created to provide access to diverse kinds of data sources. It is possible to plug-in a wide range of data sources into LINQ including file system, ActiveDirectory, Windows Management Instrumentation (WMI), the Windows Event Log or any other data source or API.
LINQ as a Set of Programming Language Extensions
The LINQ providers are in no way stand alone tools. They can be used directly in any of the .NET programming languages. LINQ allows us to access information by writing queries against various data sources. We all are quite familiar with writing SQL queries. Even in the present day data access technologies we stick on to writing queries in the programming language but using the syntax which is specific to the source. LINQ provides the same type of expressive capabilities that SQL offers, but in the programming language of our choice. This enables us to write code that is shorter and to the point. For example the keywords from, where, orderby, select etc. now are a part of C#. This tells us that C# has been extended to enable language integrated queries.
Let’s Begin:
Let us begin going in to LINQ to SQL and reading data from the database.
Step 1: Have a table in the Database
I have created a table named Parcel in the database named MyDatabase on a server named MyServer for this example. The user credentials that I will be using to log in to the MSSQL Server 2005 (MyServer) are:
User ID: Arpan_Patro
Password: infy@123
A brief description of the table Parcel is as follows.
Column Name
| Column ID
|
ParcelID(Primary Key)
| Varchar(5)
|
Status
| Varchar(10)
|
ArriveCode
| Int
|
Step 2: Open a new Console Application in Visual Studio 2008
Step 3: Include the necessary namespaces
Well, now since we want to do something new with Visual Studio we need to add some new namespaces. These are already added in the VS2008 release.
using System.Data.Linq.Mapping;
using System.Linq;
using System.Data.Linq;
Note: If the namespaces do not come by IntelliSense then you may have to add the
System.Data.Linq
namespace manually and then try getting the namespaces by IntelliSense.
The obvious question arises ‘Why do I need these or what do I gain by adding these namespaces?
We’ll explore the reasons as we move ahead with the other steps.
Step 4: Creating the Classes (Using the Attributes Table and Column)
In this step we create objects in the C# Language that represent objects in the database. Objects are linked to relational data by decorating normal classes with attributes. Two of the most important attributes are Table and Column.
The Table attribute is used to decorate the class. One of its properties is Name; it is used to specify the name of the table to which an object of the class is linked. If the Name property is missing, the name of the class will be used for the name of the table. A class decorated with the Table attribute is called an entity class, and an instance of an entity class is called an entity. Only entities are stored in a database.
For example: In my example, I have a table named Parcel so in the Program.cs file I write the code:
[Table(Name = "Parcel")]
public class Parcel
{
………..
The Column
attribute is used to decorate fields or properties of an entity class, to associate them with columns of a table. The
Column
attribute has also several properties, one of them being the Name property. As in the case of the
Name
property of Table, it is used to specify the field in the table to which the field or property of the entity class is matched. If the property is not specified, the name of the field or property of the entity class will be assumed as the name of the field in the table.
In the example I have three columns which I specify as below:
[Table(Name = "Parcel")]
public class Parcel
{
[Column]
public string ParcelID;
[Column]
public string Status;
[Column]
public int ArriveCode;
}
It is important to know that the Table and the Column Attributes are found in the
System.Data.Linq.Mapping
namespace.
Step 5: Creating the DataContext
The DataContext
in simple terms is a means by which we can pass our queries to the relational database. It is in charge of translating our LINQ query to proper SQL queries, as well as communicating with the specified database. It behaves similar to the Connection object in ADO.NET. We must pass the Connection String to the constructor of the
DataContext
class. This enables the LINQ queries which are executed in the current data context (explained further) to be executed on the database specified by the given Connection String.
After creating the tables and Columns we now move on to create the DataContext
as follows giving your own credentials, server name.
[Database(Name = "MyDatabase")]
public class MyDataBase : DataContext
{
public Table<parcel> Parcel;
public MyDataBase()
: base("Data Source = MyServer ; Initial Catalog = MyDatabase ; User ID = Arpan_Patro ; Password = infy@123")
{ }
}</parcel>
The DataContext
class is found in the System.Data.Linq
namespace.
You can apply the Database
attribute to any strongly typed DataContext
declaration.
The Database
attribute is optional. If you use it, you must use the Name property to supply a name.
If you do not apply this attribute and the connection does not specify a name, the database is assumed to have the same name as the
DataContext
class.
What we have done above is a bit more than just make a data context we have modeled a database and specified as our table class that we previously created as a table here (notice the use of Generics here).
Step 6: Writing a LINQ Query
Having successfully followed the steps from 1 through 5 we now move on to write the LINQ
query as follows in the main method:
static void Main(string[] args)
{
MyDatabase MyConnection = new MyDatabase();
var Result = from myLinQ in MyConnection.Parcel
select new{
myLinQ.ParcelID, myLinQ.Status, myLinQ.ArriveCode};
foreach (var NewRow in Result)
{
Console.WriteLine("{0,-25} {1,-10}"+ "{2}",
NewRow.ParcelID.ToString(),
NewRow.Status.ToString(),NewRow.ArriveCode.ToString());
}
Var
used above is an anonymous type in C# 3.0.It automatically infers the type on the right hand side of the assignment operation and acts as that type.
If you notice the LINQ query holds a lot of resemblance to the SQL Query in its construct. The result from the database is always returned as an
IEnumerable
so we can use the foreach
loop to access the result set.
Step 6: Now Just Build and Execute
This shows how easy it is to use such a powerful tool and write complex logic in C# with data from a variety of sources. Similar code can be written to access data
from XML files, objects, entities, etc.
Summary
In this document we have seen:
- The importance of LINQ in providing a unified programming model.
- The design goals of LINQ
- How to use the LINQ to SQL provider
Reference(s)
- Definitions of the keywords from www.msdn.com
- LINQ in Action - Fabrice Marguerie, Steve Eichert and Jim Wooley, Manning Publications.