(untagged)

SQL Server 2000 Paging and Sorting Using ROWCOUNT and SQL_VARIANT

DejaVudew

0.00/5 (No votes)

29 Sep 2007

An article discussing how to accomplish paging and sorting in Microsoft SQL Server 2000 without using concatenated SQL.

Purpose

The purpose is too provide an efficient method for sorting and paging large datasets on a Microsoft SQL Server 2000 database that does not require the use of concatenated SQL. Primarily, this is a proof-of-concept piece and you will need to do your own performance testing to see if it meets your goals.

Introduction

Ever since I heard about SQL injection attacks and the hope of improving a query's performance and maintainability by using the same query every time, I've been trying to avoid concatenated SQL. At the very least, static SQL tends to be easier to understand -- which generally results in fewer bugs. But, I always had a problem when trying to implement a paging solution that had dynamic sort values.

Then I ran into this article and noticed that the queries using ROWCOUNT tended to outperform other queries. But getting that to work for a dynamic sort value without concatenation tended to involve a temporary table, and that performed worse than using a CURSOR when you got past the first couple of pages. So I decided there had to be a way to do it...

The SQL

This stored procedure is built to run on Microsoft's Adventure Works sample database. I just picked a table that had a decent amount of rows. Without further ado, here's the SQL.

CREATE PROCEDURE dbo.up_GetSortedSalesOrdersByPageUsingRowset
    @orderedOnStart datetime,
    @orderedOnEnd datetime,
    @pageNumber int,
    @pageSize int,
    @sortExpression varchar(100),
    @sortOrder varchar(4),
    @virtualCount int OUTPUT
AS

    /*
        Make sure that the page number is at least 1
    */
    IF @pageNumber < 1
    BEGIN
        SET @pageNumber = 1
    END

    SELECT
        @virtualCount = COUNT(*)
    FROM
        Sales.SalesOrderHeader Header
    WHERE
        Header.[OrderDate] >= @orderedOnStart
        AND Header.[OrderDate] < @orderedOnEnd

    DECLARE @lastKeyValue numeric(18,0)
    DECLARE @lastAscendingSortValue SQL_Variant
    DECLARE @lastDescendingSortValue SQL_Variant

    DECLARE @numberToIgnore int

    SET @numberToIgnore = (@pageNumber-1) * @pageSize

    IF @numberToIgnore > 0
    BEGIN
        /*
        Get the last available sort data and unique key
        value from the last page.
        */
        SET ROWCOUNT @numberToIgnore

        SELECT
            @lastKeyValue = [UniqueValue],
            @lastAscendingSortValue = [AscendingSort],
            @lastDescendingSortValue = [DescendingSort]
        FROM
        (
            SELECT
                Header.[SalesOrderID] AS [UniqueValue],
                CASE
                    WHEN
                        UPPER(@sortOrder) = 'DESC'
                        AND UPPER(@sortExpression) = 'CUSTOMERID'
                    THEN
                        CONVERT(SQL_Variant, [CustomerID])
                    WHEN
                        UPPER(@sortOrder) = 'DESC'
                        AND UPPER(@sortExpression) = 'TOTALDUE'
                    THEN
                        CONVERT(SQL_Variant, [TotalDue])
                    WHEN
                        UPPER(@sortOrder) = 'DESC'
                        AND UPPER(@sortExpression) = 'ORDERDATE'
                    THEN
                        CONVERT(SQL_Variant, [OrderDate])
                    ELSE
                        NULL
                END AS [DescendingSort],
                CASE
                    WHEN
                        UPPER(@sortOrder) = 'ASC'
                        AND UPPER(@sortExpression) = 'CUSTOMERID'
                    THEN
                        CONVERT(SQL_Variant, [CustomerID])
                    WHEN
                        UPPER(@sortOrder) = 'ASC'
                        AND UPPER(@sortExpression) = 'TOTALDUE'
                    THEN
                        CONVERT(SQL_Variant, [TotalDue])
                    WHEN
                        UPPER(@sortOrder) = 'ASC'
                        AND UPPER(@sortExpression) = 'ORDERDATE'
                    THEN
                        CONVERT(SQL_Variant, [OrderDate])
                    ELSE
                        NULL
                END AS [AscendingSort]
            FROM
                Sales.SalesOrderHeader Header
            WHERE
                Header.[OrderDate] >= @orderedOnStart
                AND Header.[OrderDate] < @orderedOnEnd
            ) AS Derived
        ORDER BY
            [AscendingSort] ASC,
            [DescendingSort] DESC,
            [UniqueValue] ASC
    END

    /*
    Select the first @pageSize records that come after the last sort
    data/unique value from the last page. If this is the first page,
    just get the first @pageSize records.
    */

    SET ROWCOUNT @pageSize

    SELECT
        [SalesOrderID],
        [OrderDate],
        [TotalDue],
        [CustomerID]
    FROM
    (
        SELECT
            [SalesOrderID],
            [OrderDate],
            [TotalDue],
            [CustomerID],
            [SalesOrderID] As [UniqueValue],
            CASE
                WHEN
                    UPPER(@sortOrder) = 'DESC'
                    AND UPPER(@sortExpression) = 'CUSTOMERID'
                THEN
                    CONVERT(SQL_Variant, [CustomerID])
                WHEN
                    UPPER(@sortOrder) = 'DESC'
                    AND UPPER(@sortExpression) = 'TOTALDUE'
                THEN
                    CONVERT(SQL_Variant, [TotalDue])
                WHEN
                    UPPER(@sortOrder) = 'DESC'
                    AND UPPER(@sortExpression) = 'ORDERDATE'
                THEN
                    CONVERT(SQL_Variant, [OrderDate])
                ELSE
                    NULL
            END AS [DescendingSort],
            CASE
                WHEN
                    UPPER(@sortOrder) = 'ASC'
                    AND UPPER(@sortExpression) = 'CUSTOMERID'
                THEN
                    CONVERT(SQL_Variant, [CustomerID])
                WHEN
                    UPPER(@sortOrder) = 'ASC'
                    AND UPPER(@sortExpression) = 'TOTALDUE'
                THEN
                    CONVERT(SQL_Variant, [TotalDue])
                WHEN
                    UPPER(@sortOrder) = 'ASC'
                    AND UPPER(@sortExpression) = 'ORDERDATE'
                THEN
                    CONVERT(SQL_Variant, [OrderDate])
                ELSE
                    NULL
            END AS [AscendingSort]
        FROM
            Sales.SalesOrderHeader Header
        WHERE
            Header.[OrderDate] >= @orderedOnStart
            AND Header.[OrderDate] < @orderedOnEnd
    ) Derived
    WHERE
    (
        @lastAscendingSortValue IS NULL
        AND @lastDescendingSortValue IS NULL
        AND @lastKeyValue IS NULL
    )
    OR
    (
        (@lastAscendingSortValue IS NOT NULL)
        AND
        (
            ([AscendingSort] > @lastAscendingSortValue)
            OR
            (
                [AscendingSort] = @lastAscendingSortValue
                AND [UniqueValue] > @lastKeyValue
            )
        )
    )
    OR
    (
        (@lastDescendingSortValue IS NOT NULL)
        AND
        (
            ([DescendingSort] < @lastDescendingSortValue)
            OR
            (
                [DescendingSort] = @lastDescendingSortValue
                AND [UniqueValue] > @lastKeyValue
            )
        )
    )
    ORDER BY
        [AscendingSort] ASC,
        [DescendingSort] DESC,
        [SalesOrderID] ASC

    SET ROWCOUNT 0
GO

A Quick Caveat

This method requires that your data set has at least 1 unique value (or combination of values) in it per row. I think that tends to be most of what we use these for anyway.

The Explanation

What does it do? It gets a page of sales orders that are in a particular date range and sorted in a particular way.

How does it work? First, the paging. This is described elsewhere is better detail, but the idea behind using ROWCOUNT is that you ignore a specific series of rows. So, if I'm on the 4^th page and I have 10 records on a page, I want to ignore the first 40 rows. How do we do that? ROWCOUNT specifies the number of rows that will be returned, in our case 40, and we simply store unique information about the last row (row #40) and force the last select statement to return rows after #40.

As mentioned, this causes some difficulty when you add sorting. A common response to using this method has been, "Well, that works if I hardcode my sort value", or, "Well, that only works if my sort value is unique." So, enter SQL_VARIANT. SQL_VARIANT is interesting in that it preserves the data types, not unlike variant types in other languages like JavaScript and Visual Basic. So, we store your sort values and your unique value from the last row. It's important that we have both because what if I want to sort by a field with non-unique values, like [OrderDate]? In order to guarantee a sequence in SQL, we need to make sure that something unique about the row is included in the ORDER BY clause. This is the "magic" that gets us around the sort value uniqueness problem.

But what about the hardcoded sort values? Well, that's what those CASE statements are doing. If you look, you'll see that the values coming out of the CASE statements are just used in the ORDER BY. I guess there's a sense in which we've moved from hardcoding a single sort value to hardcoding the available sort values, but that's something I'm willing to live with. If you really need ad hoc queries, there are other methods which will probably suit your needs better. Since SQL_VARIANT preserves the types of the sort values and we're returning the columns we want to sort on in the derived query, the sort works just like it should. The IS NULL checks are there because if we're on the first page, those values should be NULL.

That's It?

You wanted more?!? Personally, this was just something that made me smack myself in the forehead and think, "OF COURSE! THAT'S how you do it." There are some things left to do, however.

Performance testing really needs to be done on larger sets of data. If anyone wants to, I'll gladly post the results (giving credit where credit is due, of course). I ran this procedure at home on SQL Server 2005 (because that's what I have) and it looked very speedy.

At any rate, I hope this helps some of you. Happy coding!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here