I'm using Oracle 11g, the main table has about 10m records. Here is my query:
SELECT COUNT (*)
FROM CONTACT c INNER JOIN STATUS S ON C.STATUS = S.STATUS
WHERE C.USER = 1 AND S.REQUIRE = 1 AND ROWNUM = 1;
The Cost is 3736, but when I changed it to this form:
SELECT COUNT (*) FROM
(SELECT 1 FROM CONTACT c INNER JOIN STATUS S ON C.STATUS = S.STATUS
WHERE C.USER = 1 AND S.REQUIRE = 1 AND ROWNUM = 1);
The Cost became 5! What's the difference between these 2 queries?
Here are the explain plan for both query:
The first query:
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 10 | 3736 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 10 | | |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | NESTED LOOPS | | 4627 | 46270 | 3736 (1)| 00:00:45 |
| 4 | TABLE ACCESS BY INDEX ROWID| CONTACT | 6610 | 33050 | 3736 (1)| 00:00:45 |
|* 5 | INDEX RANGE SCAN | IX_CONTACT_USR | 6610 | | 20 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IX_CONTACT_STATUS | 1 | 5 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM=1)
5 - access("C"."USER"=1)
6 - access("C"."STATUS"="S"."STATUS" AND "S"."REQUIRE"=1)
The second query:
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | | |
| 2 | VIEW | | 1 | | 5 (0)| 00:00:01 |
|* 3 | COUNT STOPKEY | | | | | |
| 4 | NESTED LOOPS | | 2 | 20 | 5 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| CONTACT | 3 | 15 | 5 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IX_CONTACT_USR | 6610 | | 3 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | IX_CONTACT_STATUS | 1 | 5 | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(ROWNUM=1)
6 - access("C"."USER"=1)
7 - access("C"."STATUS"="S"."STATUS" AND "S"."REQUIRE"=1)
I executed 2 queries, the first one sometimes cost 45s+ (e.g. first run or change the user id), otherwise it will cost <1s. I totally don't know why it's such different, maybe db cache?
When I executed the second query, I can always get result in 1s. So I think the second one is better, but I don't the reason why it improves a lot.
So to make SELECT COUNT(*) queries fast, here's what to do: Get on any version that supports batch mode on columnstore indexes, and put a columnstore index on the table – although your experiences are going to vary dramatically depending on the kind of query you have.
The simple answer is no – there is no difference at all. The COUNT(*) function counts the total rows in the table, including the NULL values.
The difference is simple: COUNT(*) counts the number of rows produced by the query, whereas COUNT(1) counts the number of 1 values. Note that when you include a literal such as a number or a string in a query, this literal is "appended" or attached to every row that is produced by the FROM clause.
COUNT(*) function returns the number of items in a group, including NULL and duplicate values. COUNT(DISTINCT expression) function returns the number of unique and non-null items in a group. COUNT(ALL expression) evaluates the expression and returns the number of non-null items in a group, including duplicate values.
You can see where the difference comes in by comparing the line in the execution plans that access the CONTACT
table (looks at the rows column, the first one).
First:
| 4 | TABLE ACCESS BY INDEX ROWID| CONTACT | 6610 | 33050 | 3736 (1)| 00:00:45 |
Second:
| 5 | TABLE ACCESS BY INDEX ROWID| CONTACT | 3 | 15 | 5 (0)| 00:00:01 |
In the first example, the ROWNUM = 1
predicate isn't applied until after the CONTACT
table has been accessed, so you're getting 6610
rows returned from this table. Whereas in your second query optimizer has only returned 3
. This is many orders of magnitude less, which is why you're seeing the second query complete quicker.
As to why the second execution of the "slow" query is "fast", you're thinking is correct - the data has been loaded from disk into the buffer cache so access is much quicker.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With