Introduction
This week, I decided that I wanted to know how to write queries in Linq. I am not talking about the Linq extension methods, which I have used extensively, but the query syntax which uses keywords such as “from
” and “select
”. I wasn’t sure if this would be a particularly useful skill, as I have always coped well enough using just the extension methods, however it bugged me a little that I had never really got to grips with Linq queries and wasn’t sure exactly how the syntax worked. I am comfortable writing queries in SQL so decided to create some SQL queries and learn how I would write the corresponding queries using Linq queries. This week’s blog post is essentially my notes from this exercise.
First things first, let’s get our definition out of the way. Linq is short for Language Integrated Query. It is a Microsoft .NET technology used for querying data. Whereas SQL can be used to update and delete data, Linq is just for querying data (although you can obviously update and delete query results using other .NET commands). In SQL, we query database tables, and in Linq we query .NET collections. It is slightly odd to compare SQL to Linq, as the two operate in fundamentally different environments, however I found it useful to help me understand the Linq query syntax.
Setting Up the Data
Now for some sample data.
All the SQL below can be found here, and all the C# here.
Let’s say we have two types of objects: Products
and Orders
. An Order
is associated with one or more Products
.
In C#, our classes look like this:
public class Product
{
public int Id {get; set;}
public string Name {get; set;}
public string Description {get; set;}
public decimal Price {get; set;}
public int StockLevel {get; set;}
}
public class Order
{
public int Id {get; set;}
public IList ProductIds {get; set;}
}
…or in SQL, we have tables defined like this:
CREATE TABLE Products
(
Id INT,
Name VARCHAR(50),
Description VARCHAR(2000),
Price DECIMAL,
StockLevel INT
)
CREATE TABLE Orders
(
Id INT,
DatePlaced DATETIME
)
CREATE TABLE OrderProducts
(
Id INT,
OrderId INT,
ProductId INT
)
Notice how because we have a many-to-many relationship, we need to define 3 tables in SQL, but need only define 2 entities in C#.
Now for the actual data.
C#
public static IList<Product> GetProducts()
{
var products = new List<Product>();
products.Add(new Product
{
Id = 1,
Name = "Sausages",
Description = "Succulent pork sausages, from the finest farms in the land",
Price = 4.99M,
StockLevel = 12
});
products.Add(new Product
{
Id = 2,
Name = "Bacon",
Description = "Delicious bacon, fry it or grill it!",
Price = 3.50M,
StockLevel = 5
});
products.Add(new Product
{
Id = 3,
Name = "Chicken Fillets",
Description = "Our chickens are treated well, and our fillets will treat you well",
Price = 8.99M,
StockLevel = 4
});
products.Add(new Product
{
Id = 4,
Name = "Fishcakes",
Description = "If you love fish and you love cakes then you'll love our fish cakes",
Price = 3.99M,
StockLevel = 22
});
products.Add(new Product
{
Id = 5,
Name = "Lamb Chops",
Description = "Lovely with mint sauce",
Price = 12.00M,
StockLevel = 0
});
return products;
}
public static IList<Orders> GetOrders()
{
var orders = new List<Order>();
orders.Add(new Order
{
Id = 1,
ProductIds = new List(){ 1, 4 }
});
orders.Add(new Order
{
Id = 2,
ProductIds = new List(){ 2, 3, 5 }
});
orders.Add(new Order
{
Id = 3,
ProductIds = new List(){ 3, 1 }
});
orders.Add(new Order
{
Id = 4,
ProductIds = new List(){ 4 }
});
orders.Add(new Order
{
Id = 5,
ProductIds = new List(){ 2, 4, 3, 1 }
});
return orders;
}
SQL
INSERT INTO Products(Id, Name, Description, Price, StockLevel)
VALUES (1, 'Sausages', 'Succulent pork sausages, from the finest farms in the land', 4.99, 12)
INSERT INTO Products(Id, Name, Description, Price, StockLevel)
VALUES (2, 'Bacon', 'Delicious bacon, fry it or grill it!', 3.50, 5)
INSERT INTO Products(Id, Name, Description, Price, StockLevel)
VALUES (3, 'Chicken Fillets', 'Our chickens are treated well, _
and our fillets will treat you well', 8.99, 4)
INSERT INTO Products(Id, Name, Description, Price, StockLevel)
VALUES (4, 'Fishcakes', _
'If you love fish and you love cakes then you''ll love our fish cakes', 3.99, 22)
INSERT INTO Products(Id, Name, Description, Price, StockLevel)
VALUES (5, 'Lamb Chops', 'Lovely with mint sauce', 12.00, 0)
INSERT INTO Orders (Id, DatePlaced) VALUES (1, '1 January 2015')
INSERT INTO Orders (Id, DatePlaced) VALUES (2, '1 February 2015')
INSERT INTO Orders (Id, DatePlaced) VALUES (3, '12 February 2015')
INSERT INTO Orders (Id, DatePlaced) VALUES (4, '29 March 2015')
INSERT INTO Orders (Id, DatePlaced) VALUES (5, '1 July 2015')
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (1, 1, 4)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (2, 1, 1)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (3, 2, 2)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (4, 2, 3)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (5, 2, 5)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (6, 3, 3)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (7, 3, 1)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (8, 4, 4)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (9, 5, 2)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (10, 5, 4)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (11, 5, 3)
INSERT INTO OrderProducts(Id, OrderId, ProductId) VALUES (12, 5, 1)
Notice how I have used a ProductIds
collection in the Order
class, rather than a Products
collection. This is purely to allow the exploration of the join
capabilities of Linq.
Let’s now look at some of the most common use cases for querying our data.
Selecting Everything
SQL
SELECT * FROM Products
C#
var allProducts = from p in products
select p;
Nothing too interesting here. In Linq, the select
part comes at the end, rather than the beginning of the query. This feels less intuitive to me as the whole query sounds less like a natural English sentence than its SQL equivalent.
Selecting Specific Columns
SQL
SELECT Id, Name FROM Products
C#
var idAndNameOnly = from p in products
select new { p.Id, p.Name };
This case shows some of the inherent differences between Linq and SQL. Linq is built with object-orientation in mind. In SQL, you can easily specify which columns you want to return, as your result is always a dataset
. Because Linq is a part of .NET, which is an object-oriented framework, its queries return collections of objects (and the collections themselves are also objects). So if all we want are the Product Ids and Names, we would have to do something strange such as the anonymous type notation used here. In reality, it is very unlikely that you would use C# in this way, you would instead return the complete Product
objects and just access whichever properties you needed. This could however present performance problems if the objects being returned are large and complex, and you only need to access a property or two.
When we select something other than the actual objects being queried, this is called a projection.
Filtering (WHERE)
SQL
SELECT * FROM Products WHERE StockLevel < 5
C#
var lowStock = from p in products
where p.StockLevel < 5
select p;
Straightforward enough.
Ordering
SQL
SELECT * FROM Products ORDER BY Price DESC
C#
var byPrice = from p in products
orderby p.Price descending
select p;
Again, not much difference here.
Aggregate Functions
SQL
SELECT SUM(Price) FROM Products
SELECT COUNT(*) FROM Products
SELECT MIN(StockLevel) FROM Products
C#
var priceSum = products.Sum(p => p.Price);
var productCount = products.Count();
var minStockLevel = products.Min(p => p.StockLevel);
Regarding aggregate functions, I was slightly surprised to discover that there is no way to achieve the desired result without using extension methods. This is because statements built purely from the query syntax always return a collection, and since aggregate functions return a single value, an extension method is required. In SQL, queries always return a dataset
. If your SQL query returns a single value, then it actually returns a dataset
consisting of one column and one row.
Joins
SQL
SELECT * FROM Products p
INNER JOIN OrderProducts op
ON p.Id = op.ProductId
INNER JOIN Orders o
ON o.Id = op.OrderId
C#
var orderProducts = from o in orders
select new
{
Order = o,
Products = from p in products
join pid in o.ProductIds
on p.Id equals pid
select p
};
Because of the many-to-many relationship between Orders
and Products
, we need to create an inner query in order to obtain all Products
related to an Order
. When we use the join
keyword in Linq, we must use it with an equals
operator.
Selecting Unique Values (Distinct)
SQL
SELECT DISTINCT Price FROM Products
C#
var distinctPrices = (from p in products
select p.Price).Distinct();
Again, we need to use an extension method in Linq to retrieve distinct values.
Selecting the First / Top n Results
SQL
SELECT TOP 3 * FROM Products ORDER BY Price DESC
C#
var top3Expensive = (from p in products
orderby p.Price descending
select p).Take(3);
Finally, to retrieve the “top” n results only, we again need to use an extension method.
Conclusions
Having now learnt the basics of the Linq query syntax, I must admit I am not a fan. Firstly, I find the Linq syntax to be less intuitive than SQL. It does not seem to match the structure of English sentences well enough. Secondly, it feels unnatural to write this kind of code in C#. The Linq extension methods are not only easier to read and write, but also more intuitively C#-shaped. Off the top of my head, I can’t think of any scenario which would not be supported by extension methods.
Here are all the queries above written using extension methods only:
var allProductsExt = products.Select(p => p);
var idAndNameOnlyExt = products.Select(p => new { p.Id, p.Name });
var lowStockExt = products.Where(p => p.StockLevel < 5);
var byPriceExt = products.OrderByDescending(p => p.Price);
var orderProductsExt = orders.Select(o => new
{ Order = o, Products = products.Where(p => o.ProductIds.Contains(p.Id))});
var distinctPricesExt = products.Select(p => p.Price).Distinct();
var top3ExpensiveExt = products.OrderByDescending(p => p.Price).Take(3);
Consequently, it may well be the case that this is the last time I will ever write a query using Linq query syntax. Still, it was fun to explore.
The post From SQL to Linq appeared first on The Proactive Programmer.