Paging in SQL Server 2005

January 10, 2008

By David Beahm

Introduction

Developers and database administrators have long debated methods for paging recordset results from Microsoft SQL Server, trying to balance ease of use with performance. The simplest methods were less efficient because they retrieved entire datasets from SQL Server before eliminating records which were not to be included, while the best-performing methods handled all paging on the server with more complex scripting. The ROW_NUMBER() function introduced in SQL Server 2005 provides an efficient way to limit results relatively easily.

Paging Efficiency

To scale well, most applications work with only a portion of the available data at a given time. Web-based data maintenance applications are the most common example of this, and several data-bindable ASP.NET classes (such as GridView and DataGrid) have built-in support for paging results. While it is possible to handle paging within the web page code, this may require transferring all of the data from the database server to the web server every time the control is updated. To improve performance and efficiency, data which will not be used should be eliminated from processing as early as possible.

Paging Methods

Many popular databases offer a way to limit which rows a query returns based on their position within the result set. For example, MySQL provides the LIMIT clause, which accepts two arguments. The first specifies the zero-based offset of the first record to return, and the second specifies the maximum number of records to return. The query:

SELECT * FROM table LIMIT 20,13

…will return rows 20 through 32, counting from zero (that is, the 21st through 33rd rows), assuming at least 33 records are available. If fewer than 33 records are available, the query will return all records from row 20 on. If 20 or fewer records are available, none will be returned.
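The offset-and-count behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch of the semantics, not a description of how MySQL implements LIMIT; the `limit` helper and the sample list are hypothetical.

```python
def limit(rows, offset, count):
    """Model `LIMIT offset, count`: skip `offset` rows, then return at most `count`."""
    return rows[offset:offset + count]


rows = list(range(40))                # stand-in for a 40-row result set, numbered from zero
page = limit(rows, 20, 13)
print(page[0], page[-1], len(page))   # 20 32 13
print(limit(rows[:25], 20, 13))       # only rows 20-24 exist: [20, 21, 22, 23, 24]
print(limit(rows[:20], 20, 13))       # the offset is past the end: []
```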

SQL Server does not have this functionality, but the 2005 release does have a number of other new tricks. For instance, support for CLR procedures means it is possible to take existing paging methods and write them as VB.NET or C# code that executes within the SQL Server environment. Unfortunately, for set-based data access CLR procedures are generally not as efficient as native Transact-SQL (TSQL). To ensure best performance, queries should still be written in TSQL whenever practical.

Using ROW_NUMBER()

TSQL in the 2005 release includes the ROW_NUMBER() function, which adds a numeric (bigint) field to each record containing the record’s ordinal position in the result set. Stated more simply, it adds the record’s position within the result set as an additional field so that the first record has a 1, the second a 2, etc. This may appear to be of little value, but by using nested queries we can use it to our advantage.

To demonstrate ROW_NUMBER() and to explore how the paging solution works, create a simple salary table and populate it with random data using the following commands:

CREATE TABLE [dbo].[Salaries] (
  [person] [nvarchar](50) NOT NULL,
  [income] [money] NOT NULL,
  CONSTRAINT [PK_salaries] PRIMARY KEY CLUSTERED ([person] ASC)
) ON [PRIMARY]
GO

INSERT INTO Salaries VALUES ('Joe', '28000')
INSERT INTO Salaries VALUES ('Sue', '96000')
INSERT INTO Salaries VALUES ('Michael', '45000')
INSERT INTO Salaries VALUES ('John', '67000')
INSERT INTO Salaries VALUES ('Ralph', '18000')
INSERT INTO Salaries VALUES ('Karen', '73000')
INSERT INTO Salaries VALUES ('Waldo', '47000')
INSERT INTO Salaries VALUES ('Eva', '51000')
INSERT INTO Salaries VALUES ('Emerson', '84000')
INSERT INTO Salaries VALUES ('Stanley', '59000')
INSERT INTO Salaries VALUES ('Jorge', '48000')
INSERT INTO Salaries VALUES ('Constance', '51000')
INSERT INTO Salaries VALUES ('Amelia', '36000')
INSERT INTO Salaries VALUES ('Anna', '49000')
INSERT INTO Salaries VALUES ('Danielle', '68000')
INSERT INTO Salaries VALUES ('Stephanie', '47000')
INSERT INTO Salaries VALUES ('Elizabeth', '23000')

The ROW_NUMBER() function takes no arguments. It simply adds the row number to each record in the result set. To ensure the numbering is consistent, however, SQL Server needs to know how to sort the data. Because of this, ROW_NUMBER() must be followed immediately by an OVER clause. For ROW_NUMBER(), the OVER clause must contain an ORDER BY clause (it may also contain an optional PARTITION BY clause, which is not needed here). The basic syntax for querying the Salaries table is:

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income FROM Salaries

This returns the following result:

(No column name)	person	income
1	Amelia	36000.00
2	Anna	49000.00
3	Constance	51000.00
4	Danielle	68000.00
5	Elizabeth	23000.00
6	Emerson	84000.00
7	Eva	51000.00
8	Joe	28000.00
9	John	67000.00
10	Jorge	48000.00
11	Karen	73000.00
12	Michael	45000.00
13	Ralph	18000.00
14	Stanley	59000.00
15	Stephanie	47000.00
16	Sue	96000.00
17	Waldo	47000.00

The Salaries data now appears sorted by person, and it has an extra column indicating each record’s position within the results.
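Conceptually, ROW_NUMBER() OVER(ORDER BY person) sorts the rows and counts them from one. The Python sketch below imitates that on a few of the sample rows. It is only an analogy for the numbering, built around a hypothetical `row_number` helper, and says nothing about how SQL Server evaluates the function.

```python
def row_number(rows, order_by):
    """Return (rownum, row) pairs, numbered from 1 in `order_by` order."""
    ordered = sorted(rows, key=lambda r: r[order_by])
    return list(enumerate(ordered, start=1))


salaries = [
    {"person": "Joe", "income": 28000},
    {"person": "Sue", "income": 96000},
    {"person": "Amelia", "income": 36000},
    {"person": "Anna", "income": 49000},
]

for rownum, row in row_number(salaries, "person"):
    print(rownum, row["person"], row["income"])
# 1 Amelia 36000
# 2 Anna 49000
# 3 Joe 28000
# 4 Sue 96000
```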

If for any reason you want the results displayed in a different order from the one used to number them, you can include a different ORDER BY clause as part of the normal SELECT syntax:

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income FROM Salaries ORDER BY income

This returns the following result (rows with the same income, such as Stephanie and Waldo, may appear in either order):

(No column name)	person	income
13	Ralph	18000.00
5	Elizabeth	23000.00
8	Joe	28000.00
1	Amelia	36000.00
12	Michael	45000.00
15	Stephanie	47000.00
17	Waldo	47000.00
10	Jorge	48000.00
2	Anna	49000.00
3	Constance	51000.00
7	Eva	51000.00
14	Stanley	59000.00
9	John	67000.00
4	Danielle	68000.00
11	Karen	73000.00
6	Emerson	84000.00
16	Sue	96000.00

If we want to limit the results displayed to a certain range, we need to nest this SELECT inside another one and provide a name for the ROW_NUMBER() column. To limit our results to records 5 through 9, we can use the following query:

SELECT * FROM (SELECT ROW_NUMBER() OVER(ORDER BY person) AS rownum, person, income FROM Salaries) AS Salaries1 WHERE rownum >= 5 AND rownum <= 9

This returns the following result:

rownum	person	income
5	Elizabeth	23000.00
6	Emerson	84000.00
7	Eva	51000.00
8	Joe	28000.00
9	John	67000.00

Again, we can change the sort order by adding an ORDER BY clause. This is most easily accomplished by using the outer SELECT statement:

SELECT * FROM (SELECT ROW_NUMBER() OVER(ORDER BY person) AS rownum, person, income FROM Salaries) AS Salaries1 WHERE rownum >= 5 AND rownum <= 9 ORDER BY income

This returns the following result:

rownum	person	income
5	Elizabeth	23000.00
8	Joe	28000.00
7	Eva	51000.00
9	John	67000.00
6	Emerson	84000.00
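The nested query can be read as three steps: number the rows in one order, keep a range of row numbers, then sort the survivors however you like. The sketch below walks through those steps in Python using the same sample data. It is a model of the logic only; the variable names are made up for illustration.

```python
salaries = [
    ("Joe", 28000), ("Sue", 96000), ("Michael", 45000), ("John", 67000),
    ("Ralph", 18000), ("Karen", 73000), ("Waldo", 47000), ("Eva", 51000),
    ("Emerson", 84000), ("Stanley", 59000), ("Jorge", 48000),
    ("Constance", 51000), ("Amelia", 36000), ("Anna", 49000),
    ("Danielle", 68000), ("Stephanie", 47000), ("Elizabeth", 23000),
]

# Inner query: number the rows from 1 in person order.
numbered = enumerate(sorted(salaries, key=lambda r: r[0]), start=1)

# WHERE rownum >= 5 AND rownum <= 9
window = [(n, person, income) for n, (person, income) in numbered if 5 <= n <= 9]

# Outer ORDER BY income: re-sort only the rows that survived the filter.
window.sort(key=lambda r: r[2])

for rownum, person, income in window:
    print(rownum, person, income)
# 5 Elizabeth 23000
# 8 Joe 28000
# 7 Eva 51000
# 9 John 67000
# 6 Emerson 84000
```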

If we want to support the same type of arguments that MySQL’s LIMIT clause supports, we can create a stored procedure that accepts a beginning point and a maximum number of records to return. ROW_NUMBER() requires that the data be sorted, so we will also have a required parameter for the ORDER BY clause. Execute the following statement to create a new stored procedure:

CREATE PROCEDURE [dbo].[pageSalaries]
  @start int = 1
 ,@maxct int = 5
 ,@sort  nvarchar(200)
AS
  SET NOCOUNT ON

  DECLARE
    @STMT nvarchar(max),    -- SQL statement to execute
    @ubound int             -- first row number past the requested range

  IF @start < 1 SET @start = 1
  IF @maxct < 1 SET @maxct = 1
  SET @ubound = @start + @maxct

  -- NOTE: @sort is concatenated into the statement unvalidated; pass only trusted values
  SET @STMT =  ' SELECT  person, income
                 FROM (
                       SELECT  ROW_NUMBER() OVER(ORDER BY ' + @sort + ') AS row, *
                       FROM    Salaries
                      ) AS tbl
                 WHERE  row >= ' + CONVERT(varchar(9), @start) + ' AND
                        row <  ' + CONVERT(varchar(9), @ubound) + '
                 ORDER BY row'  -- keep the page in ROW_NUMBER() order

  EXEC (@STMT)              -- return requested records

The pageSalaries procedure begins with SET NOCOUNT ON to disable the record count message (a common step for optimizing query performance). We then declare two necessary variables, @STMT and @ubound. Because we want to be able to change what ORDER BY argument is used, we need to dynamically generate our query statement by storing it in @STMT. The next lines ensure that only positive numbers are used for the starting position and maximum size, then calculate the range of ROW_NUMBER() values being requested. (If we wanted to be zero-based like MySQL’s LIMIT, we could do so with a few minor tweaks.) Once the dynamic SQL command has been strung together, it is executed so that the results are returned.
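The range arithmetic is easy to check outside the database. The sketch below mirrors the clamping and the upper-bound calculation in Python. The function names are hypothetical, and the zero-based variant is just one possible reading of the "minor tweaks" mentioned above, not something taken from the procedure itself.

```python
def page_bounds(start=1, maxct=5):
    """Mirror pageSalaries: clamp the inputs, then return (start, ubound)
    for the filter `row >= start AND row < ubound`."""
    if start < 1:
        start = 1
    if maxct < 1:
        maxct = 1
    return start, start + maxct


def page_bounds_zero_based(offset=0, maxct=5):
    """Assumed variant: accept a LIMIT-style zero-based offset instead."""
    if offset < 0:
        offset = 0
    return page_bounds(offset + 1, maxct)


print(page_bounds(4, 7))               # (4, 11): rows 4 through 10
print(page_bounds(13, 7))              # (13, 20): rows 13 through 19
print(page_bounds(-2, 0))              # (1, 2): clamped to the first row only
print(page_bounds_zero_based(20, 13))  # (21, 34): rows 21 through 33
```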

Execute the following statement to test the stored procedure (prefix it with EXEC if it is not the first statement in the batch):

pageSalaries 4, 7, 'income'

This returns the following result:

person	income
Amelia	36000.00
Michael	45000.00
Stephanie	47000.00
Waldo	47000.00
Jorge	48000.00
Anna	49000.00
Constance	51000.00

If we execute:

pageSalaries 13, 7, 'income'

we receive back:

person	income
John	67000.00
Danielle	68000.00
Karen	73000.00
Emerson	84000.00
Sue	96000.00

… because the query goes beyond the number of records available.

Taking this one step further, we can write a stored procedure that performs a more general form of paging. In fact, it can be generalized to the point that it can return any collection of fields, in any order, with any filtering clause. To create it, execute the following command:

CREATE PROCEDURE [dbo].[utilPAGE]
  @datasrc    nvarchar(200)
 ,@orderBy    nvarchar(200)
 ,@fieldlist  nvarchar(200) = '*'
 ,@filter     nvarchar(200) = ''
 ,@pageNum    int = 1
 ,@pageSize   int = NULL
AS
  SET NOCOUNT ON

  -- NOTE: the string parameters are concatenated into dynamic SQL unvalidated,
  -- so pass only trusted values
  DECLARE
     @STMT nvarchar(max)         -- SQL to execute
    ,@recct int                  -- total # of records (for GridView paging interface)

  IF LTRIM(RTRIM(@filter)) = '' SET @filter = '1 = 1'

  IF @pageSize IS NULL BEGIN
    -- the leading spaces keep the concatenated clauses from running together
    SET @STMT =  'SELECT   ' + @fieldlist +
                 ' FROM     ' + @datasrc +
                 ' WHERE    ' + @filter +
                 ' ORDER BY ' + @orderBy

    EXEC (@STMT)                 -- return requested records
  END ELSE BEGIN
    SET @STMT =  'SELECT   @recct = COUNT(*)
                  FROM     ' + @datasrc + '
                  WHERE    ' + @filter

    EXEC sp_executeSQL @STMT, @params = N'@recct INT OUTPUT', @recct = @recct OUTPUT

    SELECT @recct AS recct       -- return the total # of records

    DECLARE
      @lbound int,
      @ubound int

    SET @pageNum = ABS(@pageNum)
    SET @pageSize = ABS(@pageSize)
    IF @pageNum < 1 SET @pageNum = 1
    IF @pageSize < 1 SET @pageSize = 1

    SET @lbound = ((@pageNum - 1) * @pageSize)
    SET @ubound = @lbound + @pageSize + 1

    IF @lbound >= @recct BEGIN
      -- no records would be on the specified page,
      -- so return the last @pageSize records instead
      SET @ubound = @recct + 1
      SET @lbound = @ubound - (@pageSize + 1)
    END

    SET @STMT =  'SELECT  ' + @fieldlist + '
                  FROM    (
                            SELECT  ROW_NUMBER() OVER(ORDER BY ' + @orderBy + ') AS row, *
                            FROM    ' + @datasrc + '
                            WHERE   ' + @filter + '
                          ) AS tbl
                  WHERE   row > ' + CONVERT(varchar(9), @lbound) + ' AND
                          row < ' + CONVERT(varchar(9), @ubound) + '
                  ORDER BY row'  -- keep the page in ROW_NUMBER() order

    EXEC (@STMT)                 -- return requested records
  END

You may receive the following message from SQL Server when the procedure is created. The procedure is still created, so the message can be ignored:

Cannot add rows to sys.sql_dependencies for the stored procedure because it depends on the missing table 'sp_executeSQL'. The stored procedure will still be created; however, it cannot be successfully executed until the table exists.

The utilPAGE procedure accepts six parameters:

@datasrc		– the name of the table or view to query
@orderBy		– the ORDER BY clause
@fieldlist		– the fields to return (including calculated expressions)
@filter		– the WHERE clause
@pageNum		– the page to return (must be greater than or equal to one)
@pageSize		– the number of records per page

The stored procedure needs the name of a data source to query against (such as a table) and one or more fields to sort by (since OVER() requires an ORDER BY clause). If @filter is blank (the default), it will be set to "1 = 1" as a simple way to select all records. If @pageSize is not supplied, the query will run without paging and will not return a record count.

If, however, @pageSize is supplied, a version of the query is executed to get the total number of records. In order to have this record count available within the procedure and as a returned value, we use sp_executeSQL to support executing the statement while returning an output parameter. The record count is used to prevent returning empty results when possible, and to support paging interfaces that calculate the number of pages available (such as GridView). If we were calling this stored procedure to populate a GridView, we would return @recct as a ReturnValue parameter instead of using a result set, but we will use a result set for demonstration purposes.
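A paging interface can turn the record count into a page count with a ceiling division. The sketch below shows that arithmetic in Python. It is an assumption about how a caller might use recct, with a hypothetical `page_count` helper, and is not a description of what GridView does internally.

```python
def page_count(recct, page_size):
    """Number of pages needed to show `recct` records, `page_size` per page."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return (recct + page_size - 1) // page_size   # ceiling division


print(page_count(17, 4))   # 5
print(page_count(17, 7))   # 3
print(page_count(0, 4))    # 0
```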

The procedure then calculates the actual record positions for the requested page. Rather than allow the query to fail, safety checks ensure that @pageSize and @pageNum are greater than zero, and that the result set will not be empty when records exist. If the specified page is entirely out of range, the procedure returns the last @pageSize records instead. This is helpful if a user changes more than one setting before refreshing their data, or if a significant amount of data is deleted between requests.
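The bound calculation, including the out-of-range fallback, can be traced with a small Python model. The sketch below mirrors the procedure's arithmetic for the filter `row > lbound AND row < ubound`. The function name is hypothetical, and the sample calls assume the 17-row Salaries table.

```python
def util_page_bounds(page_num, page_size, recct):
    """Mirror utilPAGE: return (lbound, ubound), both exclusive."""
    page_num = max(abs(page_num), 1)
    page_size = max(abs(page_size), 1)
    lbound = (page_num - 1) * page_size
    ubound = lbound + page_size + 1
    if lbound >= recct:
        # The requested page is entirely out of range:
        # fall back to the last page_size rows.
        ubound = recct + 1
        lbound = ubound - (page_size + 1)
    return lbound, ubound


print(util_page_bounds(2, 4, 17))    # (4, 9): rows 5 through 8
print(util_page_bounds(13, 3, 17))   # (14, 18): rows 15 through 17
print(util_page_bounds(3, 7, 17))    # (14, 22): rows 15 through 21, of which 15-17 exist
```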

The remainder of the procedure is virtually identical to the pageSalaries procedure. To test the utilPAGE stored procedure, execute the following statement:

utilPAGE 'Salaries', 'person', '*', 'income > 1000', 2, 4

This returns the following two result sets:

recct

17

row	person	income
5	Elizabeth	23000.00
6	Emerson	84000.00
7	Eva	51000.00
8	Joe	28000.00

If we execute:

utilPAGE 'Salaries', 'person', 'person, income', '', 13, 3

…we receive back:

recct

17

person	income
Stephanie	47000.00
Sue	96000.00
Waldo	47000.00

Even though the request is for records 37 through 39, far beyond what is available, the procedure returns the last three available records. In contrast, requesting the third page with seven records per page using:

utilPAGE 'Salaries', 'person', 'person, income', '', 3, 7

…returns the same three records (after the recct result set), this time because the page is only partly out of bounds and records 15 through 17 are all that remain on it:

person	income
Stephanie	47000.00
Sue	96000.00
Waldo	47000.00

All of these examples are based on simple single-table queries, which may not reflect what you need in the real world. While the utilPAGE procedure does not support ad-hoc JOINs, it does work with views. If you want paging support for multi-table queries, you should create a view (with all of the necessary JOINs) to use as the data source. Using a view follows good design practices, as it ensures that your JOINs are performed consistently, allows easier ad-hoc querying from the command line, and is much easier to troubleshoot than a stored procedure’s dynamic SELECT statement logic.

Conclusion

While SQL Server does not have as simple a method for paging results as some other databases, features introduced in the 2005 release have made it possible to page results efficiently more easily than ever before. In the next article in this series, we will go a step further and integrate this paging logic with a GridView through a Data Access Layer.
