Introduction
Lately, I've been answering a lot of questions about tuning queries in SQL Server and checking that developers are making the best use of the database's features. As a result, I have decided to put together a series of articles on how to get the best out of your database development.
In this article, we are going to look at how to identify long-running queries and what can be done to improve them.
In the next article, we will take a more in-depth look at execution plans and indexes, and how we can take what we have identified here further.
Before we start
Before we approach any code or stored procedures, we need to think about what we are trying to achieve. We have stated that we want to identify long-running queries, but what do we mean by long-running? This is highly subjective, and really depends on the purpose of the database and the complexity of the operation. A data warehouse will obviously have a much higher time threshold than a straightforward OLTP system. For the purposes of this article, I am going to define a long-running query as one that takes over 3 seconds to run.
Now that we have decided on our criteria, how do we go about figuring out which queries actually take that long? This is where our new best friend, SQL Profiler, comes into play. This tool is invaluable for identifying security issues, bottlenecks, query hogs, and so on.
Note: For our examples, I am going to be using the SQL Server 2000 Profiler.
There are certain things that we need to keep in mind when profiling our database:
- Select only the data that needs collecting. The more data we collect, the more resources are consumed and the more performance suffers, so we are only going to pick the events and columns that we absolutely need.
- The profiler should run on a different computer than the one we are monitoring, as we don't want to affect the performance of the database server. (A scripted server-side trace, sketched just after this list, is another way to keep the overhead down.)
- We want to collect data over a typical period. It's no good profiling our application on a Saturday morning if our online day typically runs 9 to 5, Monday to Friday.
- We may need to redefine our criteria for long-running queries as we go on.
- Sometimes it is not worth trying to improve the performance of a query, and sometimes it is not possible.
- Filtering the captured information can help cut down on the chaff.
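On that second point, SQL Server 2000 can also run a trace on the server itself, with no GUI attached, via the sp_trace_* system stored procedures. Here is a minimal sketch of creating such a trace; the output path is my own assumption, so substitute one that suits your server:

    -- A minimal server-side trace sketch; the output path is an assumption.
    -- @options = 2 (TRACE_FILE_ROLLOVER) starts a new file whenever the
    -- 50 MB @maxfilesize is reached; SQL Server appends the .trc extension.
    DECLARE @TraceID int
    DECLARE @maxfilesize bigint
    SET @maxfilesize = 50
    EXEC sp_trace_create @TraceID OUTPUT, 2, N'C:\Traces\LongQueries', @maxfilesize
    SELECT @TraceID AS TraceID  -- the later snippets assume this ID

In practice, the whole server-side setup (create, events, filters, start) would go into a single batch so that @TraceID stays in scope; I will show the pieces step by step alongside the GUI instructions.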
Getting started
So, how do we set up SQL Profiler to capture the information that we need? Well, once we have started it up, we need to set up a trace. To do this, we select File > New > Trace. This prompts us for the server and the credentials we want to use to connect. Note that we must connect as a user who is a member of the sysadmin server role.
Once we have connected to the database, we are presented with the Trace Properties dialog. It is here that we are going to choose the events, columns, and filters that we are going to use.
Quick tip: Using File > New > Trace Template, we can set up a template that can be reused over and over again. Several of the settings described in this article are good candidates for saving into such a template.
Under the Events tab, we are going to select the following events:
- Stored Procedures > RPC:Completed
- TSQL > SQL:BatchCompleted
These two events tell us which stored procedure calls and direct queries need looking at.
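On a server-side trace, these same events are registered by numeric ID: RPC:Completed is event class 10 and SQL:BatchCompleted is event class 12. Each event/column pair is switched on with its own sp_trace_setevent call; as a minimal sketch (assuming the @TraceID from the sp_trace_create call is still in scope):

    -- Bind TextData (column 1) to both completion events.
    DECLARE @on bit
    SET @on = 1
    EXEC sp_trace_setevent @TraceID, 10, 1, @on  -- RPC:Completed
    EXEC sp_trace_setevent @TraceID, 12, 1, @on  -- SQL:BatchCompleted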
In the Data Columns tab, we are going to capture the following columns only:
Under the Groups node:
- Duration
Under the Columns node:
- EventClass
- DatabaseName
- TextData
- CPU
- Reads
- Writes
- NTUserName
- LoginName
- SPID
I find NTUserName and LoginName handy for spotting who is running ad-hoc queries, so that we can exclude them when analyzing the data later.
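For the server-side version, these columns also go by number: TextData = 1, NTUserName = 6, LoginName = 11, SPID = 12, Reads = 16, Writes = 17, CPU = 18, EventClass = 27, and DatabaseName = 35 (Duration, which we will filter on shortly, is column 13). Extending the sketch from the events section, each column is bound to each event class in turn:

    -- One sp_trace_setevent call per event/column pair (@on bit = 1 as before).
    EXEC sp_trace_setevent @TraceID, 10, 27, @on  -- EventClass
    EXEC sp_trace_setevent @TraceID, 10, 35, @on  -- DatabaseName
    EXEC sp_trace_setevent @TraceID, 10, 18, @on  -- CPU
    EXEC sp_trace_setevent @TraceID, 10, 16, @on  -- Reads
    EXEC sp_trace_setevent @TraceID, 10, 17, @on  -- Writes
    EXEC sp_trace_setevent @TraceID, 10, 6,  @on  -- NTUserName
    EXEC sp_trace_setevent @TraceID, 10, 11, @on  -- LoginName
    EXEC sp_trace_setevent @TraceID, 10, 12, @on  -- SPID
    EXEC sp_trace_setevent @TraceID, 10, 13, @on  -- Duration
    -- ...and the same nine calls again for event class 12 (SQL:BatchCompleted)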
Now we are going to set up the filters. To do this, under the Filters tab, we set up the following:
- Duration > Greater Than Or Equal > 3000 (Profiler measures duration in milliseconds, so this is our 3 seconds)
- Tick the Exclude system IDs box
Later on, we may consider adding other filters (such as restricting the logins that we are analyzing), but for now, this is sufficient.
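The scripted equivalents use sp_trace_setfilter. Duration is column 13, logical operator 0 means AND, and comparison operator 4 means greater-than-or-equal. There is no single switch for "Exclude system IDs" on a server-side trace; filtering out SPIDs of 50 and below (the range SQL Server 2000 reserves for system processes) is a rough stand-in of my own, not an exact equivalent of the GUI option:

    -- Duration >= 3000 ms; sp_trace_setfilter insists that the value's data
    -- type matches the column's, and Duration is a bigint.
    DECLARE @duration bigint
    SET @duration = 3000
    EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @duration
    -- Rough substitute for "Exclude system IDs": SPID (column 12) > 50.
    DECLARE @spid int
    SET @spid = 50
    EXEC sp_trace_setfilter @TraceID, 12, 0, 2, @spid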
In the General tab, we can choose whether to save the data to a file or to a SQL Server table. For our purposes, we will pick a file and not restrict its growth.
When we click Run, the profiler will start tracing the database(s) that we want to watch. We are going to let the profiler run for a while so that we get a decent amount of information that we can look at.
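For a server-side trace, the Run button's counterpart is sp_trace_setstatus, which also stops and removes the trace once we have collected enough:

    EXEC sp_trace_setstatus @TraceID, 1  -- 1 = start the trace
    -- ...let it run over a representative period, then:
    EXEC sp_trace_setstatus @TraceID, 0  -- 0 = stop the trace
    EXEC sp_trace_setstatus @TraceID, 2  -- 2 = close it and delete its definition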
Analyzing the data
Assuming that we have captured some long-running queries, we now want to see what they are. At this point, it is common to find the same query duplicated throughout the trace, so collapsing those duplicates (see the sketch below) can reduce the number of queries we need to look at quite significantly.
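A convenient way to collapse those duplicates is to load the trace file back into a table with the built-in fn_trace_gettable function and aggregate. A sketch, assuming the C:\Traces\LongQueries.trc file from the earlier example (TextData comes back as ntext, so it must be cast before it can be grouped):

    SELECT CAST(TextData AS nvarchar(4000)) AS QueryText,
           COUNT(*)      AS Executions,
           AVG(Duration) AS AvgDurationMs,
           MAX(Duration) AS MaxDurationMs
    FROM ::fn_trace_gettable(N'C:\Traces\LongQueries.trc', default)
    WHERE EventClass IN (10, 12)  -- RPC:Completed, SQL:BatchCompleted
    GROUP BY CAST(TextData AS nvarchar(4000))
    ORDER BY Executions DESC

Sorting by AvgDurationMs instead would surface the slowest queries rather than the most frequent ones.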
Again, we don't want to rush into fixing things. We need to stop and consider what these queries are doing. For instance, if one of the queries is a 10 second query that is only run once every six months, then it doesn't really make any sense trying to optimize it. If it's a 10 second query that is run twice a minute, then it is a likely candidate for improvement.
The good news is that Profiler displays the text of each captured query in the trace window, so we can copy it into Query Analyzer and view the execution plan. Armed with the execution plan and tools like the Index Tuning Wizard, we can hopefully make improvements to the queries.
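If a graphical plan is awkward to share, Query Analyzer can also produce a text version of the estimated plan without running the statement. The query below is purely a hypothetical stand-in for whatever statement the trace flagged:

    SET SHOWPLAN_TEXT ON
    GO
    -- hypothetical long-running query copied out of the trace window
    SELECT o.OrderID, o.OrderDate
    FROM Orders o
    WHERE o.OrderDate >= '20040101'
    GO
    SET SHOWPLAN_TEXT OFF
    GO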
That is where the next article picks up: we will take a more in-depth look at execution plans and indexes, and at how to take what we have identified here further.