
How to determine whether a count exceeds a threshold in the most efficient way in SQL Server?

Tags: sql-server

I want to check whether a user has produced more than 1000 logs. For the two queries below, I had SQL Server Management Studio display the estimated execution plan.

select count(*) from tbl_logs where id_user = 3

select 1 from tbl_logs where id_user = 3 having count(1) > 1000

I thought the second one would be better because it could return as soon as SQL Server has found more than 1000 rows, whereas the first one has to return the actual row count.

Also, when I profile the queries, they are equal in terms of Reads, CPU, and Duration.

What would be the most efficient query for my task?

FuryFart asked Dec 24 '22 01:12


2 Answers

This query should also improve performance:

select 1 from tbl_logs where id_user = 3 order by 1 offset (1000) rows fetch next (1) rows only

You get a 1 when more than 1,000 rows exist, and an empty result set when they don't.

It fetches only the first 1,001 rows, as Alexander's answer does, but it has the advantage that it doesn't need to count the rows it has already fetched.

If you want the result to be exactly 1 or 0, then you could read it like this:

with Row_1001 as (
  select 1 as Row_1001 from tbl_logs where id_user = 3 order by 1 offset (1000) rows fetch next (1) rows only
)
select count(*) as More_Than_1000_Rows_Exist from Row_1001
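
The same check can also be written with variables, so that the user and the threshold are not hard-coded. This is only a sketch under assumptions: the variable names @id_user and @threshold and the output column name are made up for illustration, the filter on id_user comes from the question, and OFFSET ... FETCH needs SQL Server 2012 or later.

```sql
-- Hypothetical parameters; the names are illustrative, not from the original post.
DECLARE @id_user   int = 3;
DECLARE @threshold int = 1000;

SELECT CASE WHEN EXISTS (
           SELECT 1
           FROM tbl_logs
           WHERE id_user = @id_user
           ORDER BY (SELECT NULL)      -- OFFSET requires ORDER BY; no real ordering is needed
           OFFSET @threshold ROWS      -- skip the first @threshold matching rows
           FETCH NEXT 1 ROWS ONLY      -- a surviving row means the count exceeds @threshold
       ) THEN 1 ELSE 0 END AS More_Than_Threshold_Rows_Exist;
```

This returns exactly one row containing 1 or 0, which may be easier to consume from application code than an empty result set.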
Marc Guillot answered May 10 '23 04:05


I think some performance improvement can be achieved this way:

select 1 from (
  select top 1001 1 as val from tbl_logs where id_user = 3
) cnt
having count(*) > 1000

In this example, the derived query fetches only the first 1001 rows (if they exist), and the outer query performs a logical check on the count.

However, this will not reduce reads if tbl_logs is tiny: the index seek then works on so small an index that only a few pages are fetched anyway.
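
To see whether the rewrite actually saves reads on your own data, one option is to compare the two forms with SET STATISTICS IO and SET STATISTICS TIME. The sketch below assumes the tbl_logs table from the question; the index name IX_tbl_logs_id_user is hypothetical, and that statement should be skipped if an index on id_user already exists.

```sql
-- Optional: a nonclustered index on id_user so both queries can use an index seek.
-- The index name is hypothetical; skip this if a suitable index already exists.
CREATE NONCLUSTERED INDEX IX_tbl_logs_id_user ON tbl_logs (id_user);

-- Report logical reads and CPU/elapsed time for each statement in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Full count: has to read every matching row.
SELECT COUNT(*) FROM tbl_logs WHERE id_user = 3;

-- Capped count: stops after at most 1001 matching rows.
SELECT 1
FROM (SELECT TOP (1001) 1 AS val FROM tbl_logs WHERE id_user = 3) AS cnt
HAVING COUNT(*) > 1000;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the user has only a few thousand rows, the logical reads reported for both statements will probably be close, which would match what the question observed in the profiler.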

Alexander Volok answered May 10 '23 05:05
