It’s a common practice by database developers to use local variables in stored procedures and scripts to place filter on the basis of these local variables. YES, these local variables can slow down your queries. Let’s prove it.
Create a new table and insert dummy rows.
USEAdventureWorks
GO
CREATE TABLE TempTable
(tempID UNIQUEIDENTIFIER,tempMonth INT,tempDateTime DATETIME )
GO
INSERT INTO TempTable (tempID, tempMonth, tempDateTime)
SELECT NEWID(),(CAST(100000*RAND() AS INT) % 12) + 1 ,GETDATE()
GO 100000
CREATE NONCLUSTERED INDEX[IX_tempDateTime] ON [dbo].[TempTable] ([tempDateTime] ASC) _
INCLUDE ( [tempID]) WITH ( ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO _
SET STATISTICS IO ON GO SELECT * FROM TempTable WHEREtempDateTime > '2012-07-10 03:18:01.640'
Table 'TempTable'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Check out its execution plan and index seeks properties. You can find that estimated rows are double to actual rows but that’s not a big difference to effect execution plan and resultantly optimizer has selected a proper plan to execute this query.
Query optimizer has estimated number of rows from its base statistics histogram i.e., EQ_ROWS + AVG_RANGE_ROWS (77 + 88.64286)
.
DBCC SHOW_STATISTICS ('dbo.TempTable', IX_tempDateTime)
Now, let’s modify our SELECT
query and use local variable and execute it. You will find that query optimizer has selected a different plan this time, a more costly plan. WHY??
DECLARE@RequiredDate DATETIME
SET@RequiredDate = '2012-07-10 03:18:01.640'
SELECT * FROM TempTable
WHEREtempDateTime >@RequiredDate
Table 'TempTable'. Scan count 1, logical reads 481, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Huge difference of estimated and actual number of rows clearly indicating that query optimizer was unable to properly estimate number of rows and with this wrong estimation, it has selected a more costly execution plan. Basically, Query Optimizer does not know the value of local variable at the time of optimization and resultantly, can’t use histogram of statistics. It behaves differently with inequality and equality operators.
In Case of Inequality Operator
In our case of inequality operator in query, query optimizer used a simple formula of 30% of total rows.
Estimated Rows =(Total Rows * 30)/100 = (100000*30)/100 = 30000
In Case of Equality Operator
DECLARE@RequiredDate DATETIME
SET@RequiredDate = '2012-07-10 03:18:01.640'
SELECT * FROM TempTable
WHEREtempDateTime =@RequiredDate
If equality operator is used with local variables, query optimizer gets estimated rows figure from a different formula, i.e., Density * Total Number of Table Rows. Execute the following query to get density value.
DBCC SHOW_STATISTICS ('dbo.TempTable', IX_tempDateTime)
- All Density =
0.0007358352
- Total Number of Rows in Table =
100000
- Estimated Rows = Density * Total Number =
0.0007358352 * 100000 = 73.5835
Drop table
when not required:
DROP TABLE TempTable