Table of Contents
Part 1 of this series will discuss the following:
- Introduction
- Language Features Supporting LINQ
- Extension Method
- Lambda Expression
- Local Variable Type Inference
- Anonymous Type
- Object Initialization
Introduction
So let's get started directly with the code:
int[] array = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var evenNumbers = from a in array
                  where a % 2 == 0
                  select a;
ObjectDumper.Write(evenNumbers);
The above declares an array of integers and prints only even numbers.
The output of the above code is as follows:
2
4
6
8
10
ObjectDumper is a utility class that accepts any object, uses reflection to infer its type, and prints its values to a stream. By default, the stream is the console, so it effectively does a Console.WriteLine(). So if I have a class Customer with a public field Name, and I pass an object of this class to ObjectDumper, it prints Name=<customerName>. If I pass a collection, it iterates over all the items in the collection and prints their respective values. This class can be used with .NET 2.0 as well.
Now let's see how this code looks to the C# compiler. Architects call this kind of code syntactic sugar. If you look at the code, there is no method call, no objects, and no "." operator.
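As soon as the C# compiler looks at the above code, it translates it into method calls, conceptually as shown below. (Strictly speaking, the compiler omits the trivial Select(a => a) when the query also has a where clause, but it is shown here to make the mapping obvious.)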
var evenNumbers = array.Where(a => 0 == a % 2).Select(a => a);
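To check the equivalence yourself, here is a minimal console sketch that runs both forms side by side. The class and method names (QueryTranslationDemo, EvensQuerySyntax, EvensMethodSyntax) are made up for this illustration, and the loop uses Console.WriteLine instead of ObjectDumper so that it compiles without the sample utility.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTranslationDemo
{
    // Query-expression form, as written at the start of the article.
    public static IEnumerable<int> EvensQuerySyntax(int[] array)
    {
        return from a in array
               where a % 2 == 0
               select a;
    }

    // Method-call form, as shown above. The trailing Select(a => a)
    // is an identity projection and does not change the result.
    public static IEnumerable<int> EvensMethodSyntax(int[] array)
    {
        return array.Where(a => 0 == a % 2).Select(a => a);
    }

    public static void Main()
    {
        int[] array = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        foreach (int n in EvensQuerySyntax(array))
            Console.WriteLine(n);

        // True when both forms yield the same numbers in the same order.
        Console.WriteLine(
            EvensQuerySyntax(array).SequenceEqual(EvensMethodSyntax(array)));
    }
}
```

Both methods should yield 2, 4, 6, 8 and 10, so the final line prints True.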
Now, as a developer, I am happy to see some objects and a method call. But I am still confused, as an int array does not have Where and Select methods. Has Microsoft added these new functions to this type? And what is "a => 0 == a % 2"?
Extension Method
Where and Select are extension methods. C# 3.0, which ships with .NET 3.5 (itself built over .NET 2.0), allows a developer to add her/his own functions to existing classes/types. But there are rules for adding an extension method. You can add a method to a type only from a static class. This static class should contain a static method, which can then be called as an instance method on that type. Let's take an example of what I mean by that:
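using System.Text; // for StringBuilder

public static class MyExtensions
{
    // "this" on the first parameter makes Reverse an extension method on string.
    public static string Reverse(this string str)
    {
        StringBuilder sb = new StringBuilder(str.Length);
        for (int i = str.Length - 1; i >= 0; i--)
            sb.Append(str[i]);
        return sb.ToString();
    }
}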
Here we have added a Reverse function to the String class. Now let's see how we can use this function:
string s = "Hello LINQ";
Console.WriteLine(s.Reverse());
Things to note:
- The Reverse function is a static function, but it is called as an instance function.
- The Reverse function accepts a string as an argument, whereas the call to the function passes no arguments.
- Note the function has a "this" modifier on its first parameter.
When the compiler sees a call to a function which is not a member function of that type, it looks for an extension method in the namespaces that are in scope and calls the one which is the closest match. This means that if another namespace also has Reverse as an extension method on string, the compiler will either report an ambiguity error or call the candidate it considers the closest match. To avoid the ambiguity error, you can explicitly name the namespace, class, and method you want to invoke. In our case, when the compiler sees this call, it replaces it with the following:
Console.WriteLine(MyExtensions.Reverse(s));
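The ambiguity case described above can be sketched as follows. The namespaces LibA and LibB and their classes are hypothetical, invented only to show two extension methods with the same name and signature in scope at once; the ambiguous call is left commented out because it does not compile.

```csharp
using System;
using LibA;
using LibB;

namespace LibA
{
    public static class StringExtensions
    {
        // Reverses the characters of the string.
        public static string Reverse(this string str)
        {
            char[] chars = str.ToCharArray();
            Array.Reverse(chars);
            return new string(chars);
        }
    }
}

namespace LibB
{
    public static class TextExtensions
    {
        // Same name and signature, different behaviour:
        // reverses the string and converts it to upper case.
        public static string Reverse(this string str)
        {
            return LibA.StringExtensions.Reverse(str).ToUpperInvariant();
        }
    }
}

public static class AmbiguityDemo
{
    public static void Main()
    {
        string s = "Hello LINQ";

        // Console.WriteLine(s.Reverse()); // does not compile: the call is ambiguous

        // Naming the class explicitly tells the compiler which one to use.
        Console.WriteLine(LibA.StringExtensions.Reverse(s)); // QNIL olleH
        Console.WriteLine(LibB.TextExtensions.Reverse(s));   // QNIL OLLEH
    }
}
```

Calling the method through its class name is ordinary static method syntax, which is also what the compiler generates for an unambiguous extension call.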
Lambda Expression
Coming back to the following code:
var even = array.Where(a => 0 == a % 2).Select(a => a);
Now we know that Where and Select are extension methods. But what about "a => 0 == a % 2"?
For a .NET 2.0 developer, the above code is equivalent to the code below:
var e = array.Where(delegate(int a) { return 0 == a % 2; })
             .Select(delegate(int a) { return a; });
The Where extension method looks roughly as follows:
public delegate TR Func<T0, TR>(T0 a0);

public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    if (source == null || predicate == null)
        throw new ArgumentNullException();

    // Yield only the items for which the predicate returns true.
    foreach (T item in source)
        if (predicate(item))
            yield return item;
}
This extension method says it is applicable to all IEnumerable<T> types. It iterates over all the items and returns only those items which satisfy a given condition (the predicate).
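As a rough illustration of how such a method behaves, the sketch below defines a stand-in called Filter (a hypothetical name, chosen so it does not clash with the real Where in System.Linq). It is not the framework's implementation. The argument checks are placed in a separate non-iterator method so that they run at call time, and a counter shows that the predicate is only invoked once the result is enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical stand-in for Where. Validates arguments immediately,
    // then hands off to the iterator that does the filtering.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source,
                                           Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");
        return FilterIterator(source, predicate);
    }

    // Iterator method: its body runs only while the result is being enumerated.
    private static IEnumerable<T> FilterIterator<T>(IEnumerable<T> source,
                                                    Func<T, bool> predicate)
    {
        foreach (T item in source)
            if (predicate(item))
                yield return item;
    }
}

public static class FilterDemo
{
    public static void Main()
    {
        int calls = 0;
        int[] array = new int[] { 1, 2, 3, 4, 5, 6 };

        IEnumerable<int> evens = array.Filter(a => { calls++; return a % 2 == 0; });
        Console.WriteLine(calls); // 0: nothing has been enumerated yet

        foreach (int n in evens)
            Console.WriteLine(n); // 2, 4, 6

        Console.WriteLine(calls); // 6: the predicate ran once per element
    }
}
```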
Now let's look at "Where(a => 0 == a % 2).Select(a => a);". Here the arrow (=>) operator is introduced, and the whole expression is called a lambda expression. Let's see more examples of lambda expressions:
public delegate T Func<A0, A1, T>(A0 arg0, A1 arg1);
Func<int, int, int> f = (x, y) => x * y;
ObjectDumper.Write(f(5, 6));
The output of the above code is 30.
What the compiler does is infer the types of x and y from the delegate type, which keeps the code type safe, and then create an anonymous delegate function for it, which is invoked when f is called. This makes more sense to object-oriented people.
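To make the relationship concrete, the sketch below builds the same Func<int, int, int> three ways: from a named method, from a C# 2.0 anonymous method, and from a lambda expression. The class and method names are made up for this illustration, and it uses the Func delegate from the System namespace rather than declaring its own.

```csharp
using System;

public static class LambdaDemo
{
    private static int Multiply(int x, int y)
    {
        return x * y;
    }

    // C# 1.0 style: a delegate that points at a named method.
    public static Func<int, int, int> FromNamedMethod()
    {
        return Multiply;
    }

    // C# 2.0 style: an anonymous method with explicit parameter types.
    public static Func<int, int, int> FromAnonymousMethod()
    {
        return delegate(int x, int y) { return x * y; };
    }

    // C# 3.0 style: a lambda; x and y are inferred as int from the delegate type.
    public static Func<int, int, int> FromLambda()
    {
        return (x, y) => x * y;
    }

    public static void Main()
    {
        Console.WriteLine(FromNamedMethod()(5, 6));     // 30
        Console.WriteLine(FromAnonymousMethod()(5, 6)); // 30
        Console.WriteLine(FromLambda()(5, 6));          // 30
    }
}
```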
Local Variable Type Inference
Again coming back to the main query:
var evenNumbers = from a in array
                  where a % 2 == 0
                  select a;
If you have not yet noticed, the output of the query is assigned to a var. This var is not the var of JScript, where a variable can hold a value of any type. In C#, the type of the variable is inferred at compile time from the expression assigned to it. You cannot do the following:
var unknownType = null;
This will result in a compilation error, because no type can be inferred from null.
So why is var important? Most of the extension methods provided by Microsoft return IEnumerable<T>. When a query projects its results into a new shape, T is a type that you would otherwise have to declare as a separate class for every query you write. This is not a good idea. With var, you can hold such a result in a local variable without declaring that class. That said, var can only be used for local variables; you cannot use it as the type of a function argument. So if you want to use the result of the query outside the function, you will have to define your own type, unless you are selecting the whole item.
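The points above can be sketched in a few lines. VarDemo and EvenNumbers are names invented for this illustration. Note that the method has to spell out IEnumerable<int> in its signature, while the local variable inside it can use var.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarDemo
{
    // var is not allowed as a parameter or return type,
    // so the signature names the types explicitly.
    public static IEnumerable<int> EvenNumbers(int[] array)
    {
        // Inferred at compile time as IEnumerable<int>.
        var evenNumbers = from a in array
                          where a % 2 == 0
                          select a;
        return evenNumbers;
    }

    public static void Main()
    {
        var count = 10;     // inferred as int
        var name = "LINQ";  // inferred as string

        // count = "ten";   // does not compile: count is an int, not a loose variable

        Console.WriteLine(count.GetType().Name); // Int32
        Console.WriteLine(name.GetType().Name);  // String

        foreach (var n in EvenNumbers(new int[] { 1, 2, 3, 4 }))
            Console.WriteLine(n); // 2, 4
    }
}
```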
Anonymous Type
Let's look at a different query this time:
var contacts = from c in customers
               where c.State == "WA"
               select new { c.Name, c.Phone };
If you look at the select clause, here we are creating a new type altogether, and we are not even specifying the type name. In this case, the compiler creates a new type with two public read-only properties, Name and Phone, whose types are inferred from the source type.
Note that since we have not given the type a name, we cannot refer to it elsewhere in our code. In practice, the use of this type is limited to the function in which it is created. Once the compiler creates the type, it initializes each instance with the corresponding values from the source.
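Here is a minimal, self-contained version of this query. The Customer class and the sample data are assumptions, since the article does not show them. Because the anonymous type cannot be named in a method signature, the helper converts each contact to a string before returning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer class; the article does not show its definition.
public class Customer
{
    public string Name { get; set; }
    public string Phone { get; set; }
    public string State { get; set; }
}

public static class AnonymousTypeDemo
{
    // The anonymous type stays inside this method; callers only see strings.
    public static List<string> DescribeContacts(IEnumerable<Customer> customers)
    {
        var contacts = from c in customers
                       where c.State == "WA"
                       select new { c.Name, c.Phone };

        var lines = new List<string>();
        foreach (var contact in contacts)
        {
            // Name and Phone are properties on the compiler-generated type.
            lines.Add(contact.Name + ": " + contact.Phone);
        }
        return lines;
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Alice", Phone = "555-0100", State = "WA" },
            new Customer { Name = "Bob", Phone = "555-0101", State = "OR" }
        };

        foreach (string line in DescribeContacts(customers))
            Console.WriteLine(line); // Alice: 555-0100
    }
}
```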
Object Initialization
Referring back to the same query:
var contacts = from c in customers
               where c.State == "WA"
               select new { c.Name, c.Phone };
In this case, the compiler created an anonymous type, but we never wrote a constructor that takes two parameters and initializes its members. So how do the members get initialized?
Let's take another example which will make things more clear:
public class Point
{
    private int x, y;
    public int X { get { return x; } set { x = value; } }
    public int Y { get { return y; } set { y = value; } }
}

Point a = new Point { X = 0, Y = 1 };
Now this code is equivalent to writing it as follows:
Point a = new Point();
a.X = 0;
a.Y = 1;
I guess the code speaks for itself.
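For readers who want to see the order of operations, the sketch below uses a variant of Point that records each step in a log. TracedPoint and its Log list are hypothetical additions for this illustration. The log shows the parameterless constructor running first, followed by the property setters in the order written in the initializer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of Point that records what the initializer does.
public class TracedPoint
{
    public static readonly List<string> Log = new List<string>();

    private int x, y;

    public TracedPoint()
    {
        Log.Add("ctor");
    }

    public int X { get { return x; } set { Log.Add("set X"); x = value; } }
    public int Y { get { return y; } set { Log.Add("set Y"); y = value; } }
}

public static class InitializerDemo
{
    public static void Main()
    {
        TracedPoint a = new TracedPoint { X = 0, Y = 1 };

        // Prints: ctor, set X, set Y
        Console.WriteLine(string.Join(", ", TracedPoint.Log.ToArray()));

        // Prints: 0,1
        Console.WriteLine(a.X + "," + a.Y);
    }
}
```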
Future Articles
- Part 2 of this series will discuss LINQ to SQL
- Part 3 of this series will discuss LINQ to XML
For more details, you can reach me at SumitkJain@hotmail.com.
History
- 27th August, 2007: Initial post