LINQ is a great way to quickly iterate and filter lists, whether it’s a list of data values or objects. But what is LINQ and what are the fundamentals of this framework?
Introduction
LINQ has been around since 2007 and is growing in popularity. It’s a great way to quickly iterate and filter lists, whether it’s a list of data values or objects. If you use Entity Framework, chances are you are using LINQ. But what is LINQ and what are the fundamentals of this framework?
What is LINQ?
LINQ stands for Language Integrated Query and is a replacement for the good old string-based queries. We used to write queries in strings, which made them prone to typos, hard to change, and invisible to IntelliSense; to the compiler, a query was just one big string. LINQ solves these ‘problems’.
LINQ basically consists of a set of extension methods that you can use on different types. We mostly use LINQ on collections (IEnumerable<T>, List<T>, arrays, IQueryable<T>, Dictionary<TKey, TValue>, etc.).
It is possible to use LINQ for different data sources:
- objects
- ADO
- XML
- Entity Framework
- SQL Database
- Other, as long as you use IQueryable or IEnumerable
LINQ comes in two variants: lambda expressions (also called method syntax) and query expressions (query syntax). The query expressions look the most like SQL statements. Performance-wise, there is no difference between them, because the compiler translates query expressions into the same method calls. It’s more about how you want to use them and what your preference is.
Lambda Expressions vs Query Expressions
As I said, there is no difference in performance between lambda and query expressions. But there is a big difference in syntax. Look at the code below:
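using System;
using System.Collections.Generic;
using System.Linq;

internal class LinqDemo
{
    private List<string> movieTitles =
        new() { "The Matrix", "Shrek", "Inception", "Ice Age", "John Wick" };

    public List<string> Search(string query)
    {
        // Lambda expression (method syntax); this overload ignores case.
        var resultLambda = movieTitles.Where
            (x => x.Contains(query, StringComparison.CurrentCultureIgnoreCase));

        // Query expression (query syntax); Contains(query) is case-sensitive.
        var resultQuery = from titles in movieTitles
                          where titles.Contains(query)
                          select titles;

        // Materialize the lambda result instead of returning null.
        return resultLambda.ToList();
    }
}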
Let's take a closer look:
var resultLambda = movieTitles.Where
(x => x.Contains(query, StringComparison.CurrentCultureIgnoreCase));
The line above is the lambda expression. It calls the Where extension method on the list and passes a lambda with an expression body as the predicate.
var resultQuery = from titles in movieTitles where titles.Contains(query) select titles;
This is a query expression. If you are familiar with T-SQL, this piece of code should look familiar.
Both lines of code return the same type of result, namely an IEnumerable<string>. Apart from the StringComparison argument, which makes the lambda version ignore case, the only difference is the syntax. Personally, I like the lambda more, purely because it’s somehow easier for me to write. I only use the query expressions when I need to join lists.
So, lambda:
- Creates an anonymous function
- Is recognizable by the lambda operator (=>)
- Is passed to LINQ extension methods, such as Where, which execute it for each item
Query:
- Looks like T-SQL
- Recognizable
- Is translated to lambda-based method calls by the compiler
- Is only executed when read or used (the same applies to the lambda variant)
There is no answer to which one is better because it’s a personal taste. For me, lambda works better. So the next examples will be given with lambda.
Lambda Explained
Let’s look at the previous code:
var resultLambda = movieTitles.Where(x => x.Contains
(query, StringComparison.CurrentCultureIgnoreCase));
What is happening here? Well, in short: resultLambda contains all the movies whose title contains the value stored in query. Let’s say query contains ‘a’, then only The Matrix and Ice Age will be placed in resultLambda. The Where method only returns those values from a list that match the predicate (x.Contains(…), where x represents an item from the list).
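To make the effect of the StringComparison argument visible, here is a minimal sketch that runs the same filter twice, once ignoring case and once case-sensitive. The variable names ignoreCase and caseSensitive are mine, and the Contains overload with a StringComparison assumes .NET Core 2.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var movieTitles = new List<string> { "The Matrix", "Shrek", "Inception", "Ice Age", "John Wick" };
string query = "a";

// Case-insensitive: matches both 'a' and 'A'.
List<string> ignoreCase = movieTitles
    .Where(x => x.Contains(query, StringComparison.CurrentCultureIgnoreCase))
    .ToList();

// Case-sensitive overload: matches only a lowercase 'a'.
List<string> caseSensitive = movieTitles
    .Where(x => x.Contains(query))
    .ToList();

Console.WriteLine(string.Join(", ", ignoreCase));    // The Matrix, Ice Age
Console.WriteLine(string.Join(", ", caseSensitive)); // The Matrix
```

Ice Age only shows up in the first result, because its only ‘A’ is a capital letter.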
Another example with Where:
List<Movie> movieTitles = new()
{
    new() { Id = 1, Title = "The Matrix", HasBeenSeen = true },
    new() { Id = 2, Title = "Shrek", HasBeenSeen = true },
    new() { Id = 3, Title = "Inception", HasBeenSeen = true },
    new() { Id = 4, Title = "Ice Age" },
    new() { Id = 5, Title = "John Wick" }
};

var seenMovies = movieTitles.Where(x => x.HasBeenSeen);

foreach (var item in seenMovies)
{
    Console.WriteLine(item.Title);
}

public class Movie
{
    public int Id { get; set; }
    public string Title { get; set; }
    public bool HasBeenSeen { get; set; }
}
I’ve created a list of movies, each containing an Id, a Title, and a Boolean HasBeenSeen, which states whether I have seen the movie already (true) or not (false).
Let's zoom in on the following line:
var seenMovies = movieTitles.Where(x => x.HasBeenSeen);
Here, I use the Where method to keep only the movies that have been seen. The x represents a movie from the list movieTitles.
The Where method is basically a short version of a foreach loop. I could rewrite that line to use a simple foreach instead of a lambda expression:
List<Movie> seenMovies = new();

foreach (var movie in movieTitles)
{
    if (movie.HasBeenSeen)
        seenMovies.Add(movie);
}

foreach (var item in seenMovies)
{
    Console.WriteLine(item.Title);
}
As you can see, LINQ does not only work great on lists, but it also makes the code smaller.
Most used LINQ-statements
Here is a small list of the most used LINQ statements.
Where
Filters a list by a predicate. See the examples above.
movieTitles.Where(x => x.HasBeenSeen);
Select
Selects the fields or properties you want to return and places them in a new (anonymous) object. It can also return a list of a single property.
var seenMovies = movieTitles.Select(x => new
{
    x.Title,
    NeedToSee = !x.HasBeenSeen,
});
var titles = movieTitles.Select(x => x.Title);
Single(OrDefault)
Returns the one element that matches a predicate. If no element is found, an exception will be thrown. To avoid that exception, use SingleOrDefault: if the element is not found, the return value will be null. If there is more than one matching element, an exception will be thrown, even when using SingleOrDefault.
If the element type is an int or a Boolean, the default value will not be null. It will be 0 or false, since integers and Booleans can’t be null.
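// Throws: more than one movie has been seen.
var seen = movieTitles.SingleOrDefault(x => x.HasBeenSeen);
// Returns null: no title contains "z".
var notFound = movieTitles.SingleOrDefault(x => x.Title.Contains("z"));
// Returns the movie with the title "The Matrix".
var found = movieTitles.SingleOrDefault
    (x => x.Title.Equals("The Matrix"));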
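The three outcomes described above (one match, no match, more than one match) can be tried out with a small sketch like the one below. It uses plain lists of strings and integers instead of the Movie class, so that the difference between a null default and a 0 default is visible; the variable names are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var titles = new List<string> { "The Matrix", "Shrek", "Inception", "Ice Age", "John Wick" };
var ids = new List<int> { 1, 2, 3, 4, 5 };

// Exactly one match: Single returns it.
string exact = titles.Single(t => t == "Shrek");

// No match: SingleOrDefault returns the default value of the element type.
var missingTitle = titles.SingleOrDefault(t => t.Contains("z")); // null for a string
int missingId = ids.SingleOrDefault(i => i > 100);               // 0 for an int

// More than one match: SingleOrDefault still throws.
bool threw = false;
try
{
    titles.SingleOrDefault(t => t.Contains("e"));
}
catch (InvalidOperationException)
{
    threw = true;
}

Console.WriteLine(exact);                // Shrek
Console.WriteLine(missingTitle is null); // True
Console.WriteLine(missingId);            // 0
Console.WriteLine(threw);                // True
```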
Any
Any is used to check whether at least one item matches a certain filter. For example: you want to check if there are movies that haven’t been seen yet. Instead of using Where and checking if the number of items is more than 0, you can use the Any method.
if (movieTitles.Any(x => !x.HasBeenSeen))
{
    Console.WriteLine("There are still movies to be seen.");
}
Note the exclamation mark (!) on the first line. This means that HasBeenSeen has to be false.
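Any is not only shorter than Where followed by a count, it can also stop at the first match. The sketch below counts how often the predicate runs in both variants. The tuples stand in for the Movie class, and the counters checkedByAny and checkedByCount are names I made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var movies = new List<(string Title, bool HasBeenSeen)>
{
    ("The Matrix", true), ("Shrek", true), ("Inception", true),
    ("Ice Age", false), ("John Wick", false)
};

// Any stops as soon as one item matches the predicate.
int checkedByAny = 0;
bool anyUnseen = movies.Any(x => { checkedByAny++; return !x.HasBeenSeen; });

// Where + Count has to walk through the whole list.
int checkedByCount = 0;
bool countUnseen = movies.Where(x => { checkedByCount++; return !x.HasBeenSeen; }).Count() > 0;

Console.WriteLine($"{anyUnseen}, predicate calls: {checkedByAny}");     // True, predicate calls: 4
Console.WriteLine($"{countUnseen}, predicate calls: {checkedByCount}"); // True, predicate calls: 5
```

On five movies the difference is small, but on a long list (or a database table) stopping early can matter.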
OrderBy(Descending)
Orders a list on a specific property. There are two variants:
- OrderBy orders from A-Z or 0-9
- OrderByDescending orders from Z-A or 9-0
OrderBy(Descending) returns a new object of the type IOrderedEnumerable when used on a list, or IOrderedQueryable when used on an IQueryable.
var ordered = movieTitles.OrderBy(x => x.Title);
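Here is a small sketch of both variants, plus ThenBy, which is available because the result is an ordered sequence. The tuples stand in for the Movie class and the variable names are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var movies = new List<(string Title, bool HasBeenSeen)>
{
    ("The Matrix", true), ("Shrek", true), ("Inception", true),
    ("Ice Age", false), ("John Wick", false)
};

var ascending = movies.OrderBy(x => x.Title);            // A-Z
var descending = movies.OrderByDescending(x => x.Title); // Z-A

// ThenBy adds a second sort key: unseen movies first (false sorts before true), then by title.
var unseenFirst = movies.OrderBy(x => x.HasBeenSeen).ThenBy(x => x.Title);

Console.WriteLine(string.Join(", ", ascending.Select(x => x.Title)));
// Ice Age, Inception, John Wick, Shrek, The Matrix
Console.WriteLine(string.Join(", ", descending.Select(x => x.Title)));
// The Matrix, Shrek, John Wick, Inception, Ice Age
Console.WriteLine(string.Join(", ", unseenFirst.Select(x => x.Title)));
// Ice Age, John Wick, Inception, Shrek, The Matrix
```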
First(OrDefault)/Last(OrDefault)
The First statement returns the very first item in the list. If the list is empty, it will throw an exception. To avoid this exception, you can use FirstOrDefault, which returns the default value (null, 0, or false) instead, just like SingleOrDefault. Unlike SingleOrDefault, it does not throw when more than one item matches.
The Last statement returns the very last item in the list. If the list is empty, it will throw an exception. To avoid this exception, you can use LastOrDefault, which works the same way.
var lastItem = movieTitles.Last();
var firstItem = movieTitles.First();
var firstItemWithPredicate = movieTitles.First(x => x.HasBeenSeen);
The last line returns the first item in the list that matches the HasBeenSeen filter, so only movies that have been seen are considered.
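A short sketch of how these methods behave on a filled list and on an empty one. Again, the tuples stand in for the Movie class and the variable names are my own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var movies = new List<(string Title, bool HasBeenSeen)>
{
    ("The Matrix", true), ("Shrek", true), ("Inception", true),
    ("Ice Age", false), ("John Wick", false)
};
var empty = new List<string>();

string firstSeen = movies.First(x => x.HasBeenSeen).Title; // The Matrix
string lastSeen = movies.Last(x => x.HasBeenSeen).Title;   // Inception
string lastOverall = movies.Last().Title;                  // John Wick

// On an empty list, the OrDefault variants return the default value (null for a string).
var noFirst = empty.FirstOrDefault();
var noLast = empty.LastOrDefault();

// First (and Last) without OrDefault throw on an empty list.
bool threw = false;
try
{
    empty.First();
}
catch (InvalidOperationException)
{
    threw = true;
}

Console.WriteLine($"{firstSeen}, {lastSeen}, {lastOverall}"); // The Matrix, Inception, John Wick
Console.WriteLine(noFirst is null && noLast is null);         // True
Console.WriteLine(threw);                                     // True
```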
Chaining LINQ-statements
One of the things I love the most about LINQ is that you can chain the methods, meaning you can put all the LINQ statements behind each other and you rarely have to create new variables. An example:
Movie first = movieTitles.OrderBy(x => x.Title).Where(x => x.Id > 3).First();
In the example above, I combined multiple LINQ statements in one line.
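The order of the chained methods is up to you, as long as the result stays the same. The sketch below runs the chain from the example and a variant that filters before it sorts, which leaves fewer items to sort. The tuples stand in for the Movie class and the name filteredFirst is only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var movies = new List<(int Id, string Title)>
{
    (1, "The Matrix"), (2, "Shrek"), (3, "Inception"), (4, "Ice Age"), (5, "John Wick")
};

// Same chain as in the example: sort everything, then filter, then take the first item.
var first = movies.OrderBy(x => x.Title).Where(x => x.Id > 3).First();

// Filtering first gives the same result here, but only two items have to be sorted.
var filteredFirst = movies.Where(x => x.Id > 3).OrderBy(x => x.Title).First();

Console.WriteLine(first.Title);            // Ice Age
Console.WriteLine(first == filteredFirst); // True
```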
IEnumerable vs List
Ah, a question I have been asked so many times when interviewing for a new job:
“Can you tell me the differences between IEnumerable and a List?”
Usually, I just said that one is an interface and the other is a class… I am not wrong. But it’s not what they wanted to hear.
If you work, or are starting to work, with LINQ, you should know the difference a bit better. Especially the reason why LINQ is returning IEnumerable values.
- An IEnumerable isn’t executed yet. ToList() executes the query on the IEnumerable.
- You can keep filtering on an IEnumerable without executing the filter.
- IEnumerable provides deferred execution:
The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
So, basically, an IEnumerable stores the information that will be executed later. But … why? Let’s take a look at a database.
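Before moving on to the database, deferred execution can be observed on a plain in-memory list. This is a minimal sketch; the evaluations counter is something I added purely to show when the predicate actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3 };
int evaluations = 0;

// Only the 'recipe' is stored here; the predicate has not run yet.
IEnumerable<int> evens = numbers.Where(n => { evaluations++; return n % 2 == 0; });
Console.WriteLine(evaluations); // 0

numbers.Add(4);

// ToList() enumerates the query now, so it also sees the 4 that was added afterwards.
List<int> snapshot = evens.ToList();
Console.WriteLine(evaluations);                 // 4
Console.WriteLine(string.Join(", ", snapshot)); // 2, 4

numbers.Add(6);

// The list is a fixed copy; the IEnumerable is executed again on every enumeration.
int snapshotCount = snapshot.Count;
int liveCount = evens.Count();
Console.WriteLine(snapshotCount); // 2
Console.WriteLine(liveCount);     // 3
```

So a List holds results, while an IEnumerable returned by LINQ holds a query that runs each time you enumerate it.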
LINQ and Entity Framework
If you are using Entity Framework, chances are you get data from the entities, filter them, or do other stuff via LINQ. Imagine having a database with a table of movies. That table contains 100,000 movies. Say that only 10 movies have “The” in the title… Do you get 100,000 movies from the database and then filter them? It would be better to filter in a SQL query like this:
SELECT * FROM Movies WHERE Title like '%the%'
But we are using Entity Framework. Filtering on “the” will be done with the Where statement and looks like this:
DataContext dbContext = new();
var seenMovies = dbContext.Movies
    .Where(x => x.Title.Contains("the"));
Let’s start the application and debug “seenMovies”. Note that the seenMovies variable is still an IEnumerable.
The type of seenMovies is EntityQueryable, which implements IQueryable and therefore also IEnumerable. The property “Results View” has some sort of warning:
Expanding the Results View will enumerate the IEnumerable.
This means that if you expand it, the query of the IEnumerable will be executed on the (SQL) server.
Now, let’s execute the same code, but with one small change:
DataContext dbContext = new();
var seenMovies = dbContext.Movies
    .Where(x => x.Title.Contains("the"))
    .ToList();
The result is different: you now have a list of movies with one movie (which is expected). The query of the IEnumerable has been executed and results in a List<Movie>.
So, what’s the big advantage of executing the query at a later stage? Well, you can implement multiple LINQ statements before executing. Let’s say we create a method with an extended filter. The filter looks like this:
public class Filter
{
    public string query { get; set; }
    public bool? hasBeenSeen { get; set; }
    public string OrderBy { get; set; }
}
All properties are optional. If the query is empty, don’t filter on the query. If hasBeenSeen is null, show seen and not seen movies. And now the implementation of the method:
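public List<Movie> Filter(Filter filter)
{
    // IQueryable (not IEnumerable) keeps the filters translatable to SQL;
    // typed as IEnumerable, the Where calls below would run in memory.
    IQueryable<Movie> movies = dbContext.Movies;

    if (!string.IsNullOrEmpty(filter.query))
        // Plain Contains is used because EF Core cannot translate the
        // StringComparison overload; case sensitivity follows the database collation.
        movies = movies.Where(x => x.Title.Contains(filter.query));

    if (filter.hasBeenSeen.HasValue)
        movies = movies.Where(x => x.HasBeenSeen == filter.hasBeenSeen.Value);

    // The query is sent to the database only here.
    return movies.ToList();
}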
First, I get the movies from the context (dbContext.Movies). Well, not really. I make a link (not LINQ) to the movies. Then I check if the query is filled or not. If it is, I add the Where to movies. After that, I check if hasBeenSeen has a value. If so, I add another Where to movies.
Now, I did three things with LINQ on an entity, but the data is not retrieved from the actual database. That only happens on the last line (return movies.ToList()).
If you use a profiler on the database, put a breakpoint on the last line (return movies.ToList()), and start the application, you will notice that nothing is executed on the database until you pass that line.
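If you don’t have a database and a profiler at hand, the same pattern of building up a query step by step can be tried in memory. In the sketch below, AsQueryable() on an array is only a stand-in for dbContext.Movies (an assumption for illustration; no Entity Framework is involved), and the anonymous objects stand in for the Movie entity. The variables search and hasBeenSeen play the role of the Filter properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for dbContext.Movies: an in-memory array wrapped in an IQueryable.
var movies = new[]
{
    new { Title = "The Matrix", HasBeenSeen = true },
    new { Title = "Shrek", HasBeenSeen = true },
    new { Title = "Inception", HasBeenSeen = true },
    new { Title = "Ice Age", HasBeenSeen = false },
    new { Title = "John Wick", HasBeenSeen = false }
}.AsQueryable();

string search = "The";
bool? hasBeenSeen = true;

// Each Where only extends the stored expression tree; nothing is filtered yet.
var query = movies;
if (!string.IsNullOrEmpty(search))
    query = query.Where(x => x.Title.Contains(search));
if (hasBeenSeen.HasValue)
    query = query.Where(x => x.HasBeenSeen == hasBeenSeen.Value);

// Prints the composed expression, which still has to be executed.
Console.WriteLine(query.Expression);

// Only ToList() runs the combined filters.
var results = query.ToList();
Console.WriteLine(results.Count);    // 1
Console.WriteLine(results[0].Title); // The Matrix
```

With Entity Framework, that stored expression is what gets translated into a single SQL statement when the query is finally executed. Note that this in-memory version compares case-sensitively, which is why the sketch searches for “The” instead of “the”.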
Conclusion
Well, that covers most of the LINQ fundamentals. I hope you notice you can use it fairly easily. There is plenty of documentation online that can help you solve your LINQ challenges. If you want more information about the topic, check out Microsoft Learn.
History
- 16th December, 2022: Initial version