Efficient way of getting @@rowcount from a query using row_number

Tags:

I have an expensive query using the row_number over() functionality in SQL Server 2005. I return only a sub list of those records as the query is paginated. However, I would like to also return the total number of records, not just the paginated subset. Running the query effectively twice to get the count is out of the question.

Selecting count(*) is also out of the question as the performance is absolutely terrible when I've tried this.

What I'd really love is @@ROW_NUMBERROWCOUNT :-)

374

asked Jun 24 '09 13:06

Phil Bennett

1 Answers

Over the years a pile of developer sweat has gone into efficiently paging result sets. Yet, there is no one answer--it depends on your use case. Part of the use case is getting your page efficiently, part is figuring out how many rows are in a complete result set. So sorry if i stray a little into paging, but the two are pretty tightly coupled in my mind.

There are a lot of strategies, most of which are bad if you have any sort of data volume & don't fit the use case. While this isn't a complete list, following are some of the options.....

Run Separate `Count(*)`

run a separate query that does a simple "select count(*) from MyTable"
simple and easy for a small table
good on an unfiltered large table that is either narrow or has a compact non-clustered index you can use
breaks down when you have a complicated WHERE/JOIN criteria because running the WHERE/JOIN twice is expensive.
breaks down on a wide index because the number of reads goes up.

Combine `ROW_Number() OVER()` and `COUNT(1) OVER(PARTITION By 1)`

This was suggested by @RBarryYoung. It has the benefit of being simple to implement and very flexible.
The down side is that there are a lot of reasons this can become extremely expensive quickly.
For example, in a DB i'm currently working there is a Media table with about 6000 rows. It's not particularly wide, has a integer clustered PK and, as well as a compact unique index. Yet, a simple COUNT(*) OVER(PARTITION BY 1) as TotalRows results in ~12,000 reads. Compare that to a simple SELECT COUNT(*) FROM Media -- 12 reads. Wowzers.

UPDATE -- the reads issue I mentioned is a bit of red-herring. It turns out, that with windowed functions the unit used to measure reads is kind of mixed. The net result is what appears to be massive numbers of reads. You can see more on the issue here : Why are logical reads for windowed aggregate functions so high?

Temp Tables / Table Variables

There are lots of strategies that take a result set and insert relevant keys or segments of results into temp tables / table variables.
For small/medium sized result sets this can provide great results.
This type of strategy works across almost any platform/version of SQL.
Operating on a result set multiple times (quite often a requirement) is also easy.
The down side is when working with large results sets ... inserting a few million rows into a temp table has a cost.
Compounding the issue, in a high volume system pressure on TempDB can be quite a factor, and temp tables are effectively working in TempDB.

Gaussian Sum / Double Row Number

This idea relies on subset of something the mathematician Gauss figured out (how to sum a series of numbers). The subset is how to get row count from any point in the table.
From a series of numbers (Row_Number()) the row count for 1 to N is (N + 1) - 1. More explanation in the links.
The formula seems like it would net out to just N, but the if you stick with the formula an interesting things happens, you can figure out row count from a page in the middle of the table.
The net result is you do ROW_Number() OVER(Order by ID) and ROW_Number() OVER(Order by ID DESC) then sum the two numbers and subtract 1.
Using my Media table as an example my reads dropped from 12,000 to about 75.
In a larger page you've ended up repeating data many many times, but the offset in reads may be worth it.
I haven't tested this on too many scenarios, so it may fall apart in other scenarios.

Top (@n) / SET ROWCOUNT

These aren't specific strategies per-se, but are optimizations based on what we know about the query optimizer.
Creatively using Top(@n) [top can be a variable in SQL 2008] or SET ROWCOUNT can reduce your working set ...even if you're pulling a middle page of a result set you can still narrow the result
These ideas work because of query optimizer behavior ...a service pack/hotfix can change the behavior (although probably not).
In certian instances SET ROWCOUNT can be a bit in accurate
This strategy doesn't account for getting the full row count, just makes paging more efficient

So what's a developer to do?

Read my good man, read. Here are some articles that I've leaned on...

A More Efficient Method for Paging Through Large Result Sets
Optimising Server-Side Paging - Part I
Optimising Server-Side Paging - Part II
Explaination of the Gaussian Sum
Returning Ranked Results with Microsoft SQL Server 2005
ROW_NUMBER() OVER Not Fast Enough With Large Result Set
Retrieving the First N Records from a SQL Query
Server Side Paging using SQL Server 2005
Why are logical reads for windowed aggregate functions so high?

Hope that helps.

191

answered Sep 23 '22 10:09

EBarr

Related questions
                            
                                How to add a user defined column with a single value to a SQL query
                            
                                Concat all column values in sql
                            
                                SQL injection attack with php
                            
                                Looking for an embeddable SQL beautifier or reformatter [closed]
                            
                                Export SQL query data to Excel
                            
                                How to insert a unique ID into each SQLite row?
                            
                                Distinct random time generation in the fixed interval
                            
                                save var_dump into text file [closed]
                            
                                Export all MS Access SQL queries to text files
                            
                                Stop showing SQL errors in Visual Studio 2010
                            
                                PostgreSQL - GROUP BY clause or be used in an aggregate function
                            
                                SQL Insert Query Using C#
                            
                                Get most common value for each value of another column in SQL
                            
                                Update multiple tables in SQL Server using INNER JOIN [duplicate]
                            
                                Select value if condition in SQL Server [duplicate]
                            
                                Mysql SELECT CASE WHEN something then return field
                            
                                Oracle SQL query for Date format
                            
                                left join turns into inner join
                            
                                sql searching multiple words in a string
                            
                                How to restore a SQL Server 2012 database to SQL Server 2008 R2?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Efficient way of getting @@rowcount from a query using row_number

Tags:

sql

sql-server

sql-server-2005

Phil Bennett

People also ask

1 Answers

Run Separate `Count(*)`

Combine `ROW_Number() OVER()` and `COUNT(1) OVER(PARTITION By 1)`

Temp Tables / Table Variables

Gaussian Sum / Double Row Number

Top (@n) / SET ROWCOUNT

So what's a developer to do?

EBarr

Recent Activity

Donate For Us

Efficient way of getting @@rowcount from a query using row_number

Tags:

sql

sql-server

sql-server-2005

Phil Bennett

People also ask

1 Answers

Run Separate Count(*)

Combine ROW_Number() OVER() and COUNT(1) OVER(PARTITION By 1)

Temp Tables / Table Variables

Gaussian Sum / Double Row Number

Top (@n) / SET ROWCOUNT

So what's a developer to do?

EBarr

Related questions

Recent Activity

Donate For Us

Run Separate `Count(*)`

Combine `ROW_Number() OVER()` and `COUNT(1) OVER(PARTITION By 1)`