
Paging, sorting and filtering in a stored procedure (SQL Server)

I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the ASP.NET ObjectDataSource, but it could be considered a more general problem.

The requirement is to return a subset of the data based on the usual paging parameters, startRowIndex and maximumRows, plus a sortBy parameter that allows the data to be sorted. Some further parameters are passed in to filter the data on various conditions.

One common way to do this seems to be something like this:

[Method 1]

;WITH stuff AS (
    SELECT 
        CASE 
            WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
            WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
            WHEN @SortBy = ... 
            ELSE ROW_NUMBER() OVER (ORDER BY whatever)
        END AS Row,
        ., 
        ., 
        .,
    FROM Table1
    INNER JOIN Table2 ...
    LEFT JOIN Table3 ...
    WHERE ... (lots of things to check)
    ) 
SELECT *
FROM stuff 
WHERE (Row > @startRowIndex)
AND   (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row

One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice.

One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal.

[Method 2]
An interesting alternative is given here (http://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. The author reckons that performance is better because the CASE expression in the first solution drags things down. Fair enough, and this solution makes it easy to get the totalRows and slap it into an output parameter. But I hate coding dynamic SQL. All that 'bit of SQL ' + STR(@parm1) +' bit more SQL' gubbins.

[Method 3]
The only way I can find to get what I want, without repeating code which would have to be synchronized, and keeping things reasonably readable is to go back to the "old way" of using a table variable:

DECLARE @stuff TABLE (Row INT, ...)

INSERT INTO @stuff
SELECT 
    CASE 
        WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
        WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
        WHEN @SortBy = ... 
        ELSE ROW_NUMBER() OVER (ORDER BY whatever)
    END AS Row,
    ., 
    ., 
    .,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)

SELECT *
FROM @stuff
WHERE (Row > @startRowIndex)
AND   (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row

(Or a similar method using an IDENTITY column on the table variable). Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter.
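As a minimal sketch of that idea (not the actual procedure): the sample table @source, its Name column and the variable values below are assumptions made up for illustration, and in a real stored procedure @totalRows would be declared as an OUTPUT parameter rather than a local variable.

```sql
-- Hypothetical paging values; in a stored procedure these would be
-- input parameters and @totalRows would be an OUTPUT parameter.
DECLARE @startRowIndex INT = 0, @maximumRows INT = 2, @totalRows INT;

-- Hypothetical sample data standing in for the real joins and filters.
DECLARE @source TABLE (Name VARCHAR(50));
INSERT INTO @source (Name)
VALUES ('Cherry'), ('Apple'), ('Banana'), ('Date'), ('Elderberry');

-- Number the filtered rows once and keep them in the table variable.
DECLARE @stuff TABLE (Row INT PRIMARY KEY, Name VARCHAR(50));

INSERT INTO @stuff (Row, Name)
SELECT ROW_NUMBER() OVER (ORDER BY Name), Name
FROM @source;

-- Count the already-filtered rows; the WHERE clause is not repeated.
SELECT @totalRows = COUNT(*) FROM @stuff;

-- Return only the requested page.
SELECT Row, Name
FROM @stuff
WHERE Row > @startRowIndex
  AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;

SELECT @totalRows AS TotalRows;
```

Reading @@ROWCOUNT immediately after the INSERT should give the same total without a second scan of the table variable.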

I did some tests, and with a fairly simple version of the query (no sortBy and no filter) method 1 comes out on top (almost twice as quick as the other 2). Then I decided that to test properly I needed the full complexity, and I needed the SQL to be in stored procedures. With this I get method 1 taking nearly twice as long as the other 2 methods. Which seems strange.

Is there any good reason why I shouldn't spurn CTEs and stick with method 3?


UPDATE - 15 March 2012

I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid.

I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter description string and parameter list) and it's much more readable. Also this method runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.
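For illustration, a parameterized sp_executesql call might look roughly like the sketch below. It is not the procedure I actually tested: it pages over sys.objects only so that it runs in any database, and the whitelist of sort columns, the @type filter and all variable names are assumptions. Column names cannot be passed as parameters, so the ORDER BY text is still concatenated, but only from values picked out of a fixed list.

```sql
-- Hypothetical inputs; in a stored procedure these would be parameters.
DECLARE @SortBy NVARCHAR(20) = N'Name', @SortDirection NVARCHAR(4) = N'DESC';
DECLARE @startRowIndex INT = 0, @maximumRows INT = 10, @totalRows INT;

-- Whitelist the sort column so no user text is concatenated into the SQL.
DECLARE @orderBy NVARCHAR(100) =
    CASE @SortBy WHEN N'Name' THEN N'name' ELSE N'object_id' END
    + CASE WHEN @SortDirection = N'DESC' THEN N' DESC' ELSE N' ASC' END;

-- The filter text is written once and reused for the page and the count.
DECLARE @where NVARCHAR(200) = N' WHERE type = @type ';

DECLARE @sql NVARCHAR(MAX) = N'
;WITH stuff AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ' + @orderBy + N') AS Row,
           name, object_id
    FROM sys.objects' + @where + N'
)
SELECT Row, name, object_id
FROM stuff
WHERE Row > @startRowIndex
  AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;

SELECT @totalRows = COUNT(*) FROM sys.objects' + @where + N';';

-- Values travel as real parameters, described by the definition string.
EXEC sp_executesql
    @sql,
    N'@type CHAR(2), @startRowIndex INT, @maximumRows INT, @totalRows INT OUTPUT',
    @type = 'S',
    @startRowIndex = @startRowIndex,
    @maximumRows = @maximumRows,
    @totalRows = @totalRows OUTPUT;

SELECT @totalRows AS TotalRows;
```

Because the statement text contains a plain ORDER BY on a single column instead of a CASE over every possible sort, the optimizer can plan for the sort that is actually requested, which may be part of why this version ran fastest.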

Fruitbat asked Mar 13 '12 21:03


1 Answer

I would most likely split the @SortBy argument into two, @SortColumn and @SortDirection, and use them like this:

…
ROW_NUMBER() OVER (
  ORDER BY CASE @SortColumn
    WHEN 'Name'      THEN Name
    WHEN 'OtherName' THEN OtherName
    …
  END *
  CASE @SortDirection
    WHEN 'DESC' THEN -1
    ELSE 1
  END
) AS Row
…
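One caveat: multiplying by -1 only works when the sort columns are numeric. For character columns such as Name, SQL Server would try to convert the string to a number and fail. A common alternative, sketched here with a made-up @people table and sample values, is to use two ORDER BY expressions, one per direction, so that only one of them is non-NULL for a given call:

```sql
-- Hypothetical inputs and sample data for illustration only.
DECLARE @SortColumn VARCHAR(20) = 'Name', @SortDirection VARCHAR(4) = 'DESC';

DECLARE @people TABLE (Name VARCHAR(50), OtherName VARCHAR(50));
INSERT INTO @people (Name, OtherName)
VALUES ('Alice', 'Zed'), ('Bob', 'Yan'), ('Carol', 'Xia');

SELECT
    ROW_NUMBER() OVER (
        ORDER BY
            -- Active only for ascending sorts; NULL for every row otherwise.
            CASE WHEN @SortDirection <> 'DESC' THEN
                CASE @SortColumn
                    WHEN 'Name'      THEN Name
                    WHEN 'OtherName' THEN OtherName
                END
            END ASC,
            -- Active only for descending sorts.
            CASE WHEN @SortDirection = 'DESC' THEN
                CASE @SortColumn
                    WHEN 'Name'      THEN Name
                    WHEN 'OtherName' THEN OtherName
                END
            END DESC
    ) AS Row,
    Name,
    OtherName
FROM @people
ORDER BY Row;
```

All branches of one CASE must have compatible types, so columns of different types (say a date and a string) would need either explicit conversions or separate CASE expressions.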

And this is how the TotalRows column could be defined (in the main select):

…
COUNT(*) OVER () AS TotalRows
…
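Put together with the CTE from Method 1, that could look like the following sketch. The @source table and the paging values are invented for the example; the real query would keep its joins and WHERE clause inside the CTE.

```sql
-- Hypothetical paging values and sample data for illustration only.
DECLARE @startRowIndex INT = 0, @maximumRows INT = 2;

DECLARE @source TABLE (Name VARCHAR(50));
INSERT INTO @source (Name)
VALUES ('Cherry'), ('Apple'), ('Banana'), ('Date'), ('Elderberry');

;WITH stuff AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY Name) AS Row,
        -- Empty OVER () means the window is the whole filtered set,
        -- so every row carries the total number of rows.
        COUNT(*) OVER () AS TotalRows,
        Name
    FROM @source
)
SELECT Row, TotalRows, Name
FROM stuff
WHERE Row > @startRowIndex
  AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;
```

The total is still repeated on every row of the page, and a request past the last page returns no rows and therefore no total, so the caller has to handle that case.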
Andriy M answered Sep 28 '22 04:09