Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server / SQL-Server-2008

Set Based and Procedural Approaches in Examples

4.11/5 (5 votes)
4 Jan 2010CPOL4 min read 15.3K  
Set based and procedural approaches in examples tested on SQL Server 2008

Introduction

In this article, I'll make a comparison of set based and procedural approaches using some simple examples. 

For running the scripts, you'll need AdventureWorks database that can be downloaded here. Below is the diagram of the tables that will be used in these examples:  

db_structure.jpg

Contact – 19972 rows
SalesOrderHeader – 31465 rows

The Task

Just for testing purposes, let’s say we need to get the list of all contacts and for each contact the sum of SubTotal column from SalesOrderHeader table. The resulting table should contain the following columns:

ContactID, LastName, FirstName, TotalSales

Testing

With getting ContactID, FirstName and LastName all is simple, we just query Contact table and get the results. Now how to get TotalSales information for each contact? If we follow the “procedural” approach, then one solution could be to write a scalar UDF and use it in the query.  Below is the UDF script:

SQL
CREATE FUNCTION dbo.GetContactTotalSales
(
	@contactID int
)
RETURNS money
AS
BEGIN
	DECLARE @contactTotalSales money;		

	SET @contactTotalSales = (SELECT Sum(SubTotal) _
		FROM Sales.SalesOrderHeader WHERE ContactID = @contactID);

	RETURN 	@contactTotalSales;
END 

Now the final query will look like this:

SQL
SELECT  
	c.ContactID,
	c.LastName,
	c.FirstName,
	dbo.GetContactTotalSales(c.ContactID) as TotalSales
FROM
	Person.Contact c 

To test each query, I used SQL Server Profiler. I took the standard template and just added to it the query execution plan. Also, before running each query, I was executing: 

SQL
DBCC DROPCLEANBUFFERS 

This cleans the buffers from the buffer pool. Otherwise second call will be much faster than the previous one because SQL Server would return the results from the cache. After running the query, I got the following results:

profiler.jpg

Below are the significations of these columns:

  • CPU - Amount of CPU time (in milliseconds) used by the event
  • Reads - Number of logical disk reads performed by the server on behalf of the event
  • Duration - Amount of time taken by the event ( in microseconds)

To understand how bad these results are, we need to compare them with the results of the “set based” variant. Below is the query script:

SQL
SELECT  
	c.ContactID,
	c.LastName,
	c.FirstName,
	ts.TotalSales as TotalSales
FROM
	Person.Contact c
LEFT OUTER JOIN
	(
		SELECT     
			soh.ContactID, SUM(soh.SubTotal) AS TotalSales
		FROM         
			Sales.SalesOrderHeader soh
		GROUP BY 
			soh.ContactID
	) as ts ON ts.ContactID = c.ContactID

After running the query, I got the following results:

  • CPU: 203
  • READS: 1273
  • DURATION: 787

If we compare the durations of “set based” and “procedural” queries, then we get that procedural query is about 159 times slower. That’s a huge difference!

Things are not always so bad with “procedural” approach. For small number of rows, the performance of the queries could be similar or even better when comparing to “set based” queries. If there is no way to write the query in “set based” form, then write it in “procedural” form but just make sure you don't return too many rows. This number should be determined separately for each particular case through testing.

There is also another way of writing the query that I would like to discuss. The inline query could be placed in the select part.

SQL
 SELECT  
	c.ContactID,
	c.LastName,
	c.FirstName,
	(
		SELECT     
			SUM(soh.SubTotal)
		FROM         
			Sales.SalesOrderHeader soh
		WHERE
			soh.ContactID = c.ContactID
	) as TotalSales
FROM
	Person.Contact c 

Many developers will consider this as another way of writing the query using “procedural” approach. However the results are showing the opposite:

  • CPU: 140
  • READS: 1273
  • DURATION: 794

As we can see, the results are almost the same as for “set based” query. If we compare the query plans, then we can see that the query plans are almost equal with an insignificant difference.

“Set based” query:

set_based_plan.jpg - Click to enlarge image

“inline query in the select” query:

inlineq_plan.jpg - Click to enlarge image

As you can see, the only difference is the position of the “compute scalar” operation. The conclusion I can make is that SQL Server is smart enough to handle such queries as “set based” queries. Anyway I don't recommend adding inline queries to the select because I noticed that when queries become more complex, SQL Server starts to handle them worse than equivalent “classic set based” queries.

Improve “Set Based” Queries with Views

One of the benefits of using “procedural” approach is that it makes the query more readable. It’s easy to move a complex query to a UDF and reuse it in other queries. On the other hand, “set based” queries tend to be more complex to write and read. A simple way to improve “set based” queries is to use views. Views allow you to do the same as UDFs (encapsulating SQL script) but without any overhead. So, if we create the following view:

SQL
CREATE VIEW dbo.ContactTotalSales
AS
	SELECT     
		ContactID, SUM(SubTotal) AS TotalSales
	FROM         
		Sales.SalesOrderHeader
	GROUP BY 
		ContactID 

then we can make our query look like this:

SQL
SELECT  
	c.ContactID,
	c.LastName,
	c.FirstName,
	ts.TotalSales
FROM
	Person.Contact c
LEFT OUTER JOIN
	ContactTotalSales ts ON ts.ContactID = c.ContactID 

If we'll compare the performance of this query, we will see that it’s equal to original “set based” query.

Conclusion

Always test your queries for performance and check their execution plan. Improve complex query structures using views. Use “procedural” approach very carefully and only if there is no other choice.

Thank you for reading. I hope this was useful to you.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)