Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is it that the IsEqual (=) operator is working faster than the IsNotEqual (<>) operator in Oracle?

Tags:

sql

oracle

Like the title says, if anyone has the answer I would like to know. I've been googling but couldn't find a straight answer.

Example:

This works

SELECT COUNT(*) FROM Table1 TB1, Table2 TB2
WHERE TB1.Field1 = TB2.Table2 

This seems to take hours

SELECT COUNT(*) FROM Table1 TB1, Table2 TB2
WHERE TB1.Field1 <> TB2.Table2 
like image 315
jovany Avatar asked Jul 28 '10 11:07

jovany


4 Answers

Because they are different SQL sentences. In the first one, you are joining two tables using Field1 and Table2 fields. Probably returning a few records.

In the second one, your query is probably returning a lot of records, since you are doing a cross join, and a lot of rows will satisfy your Field1 <> Table2 condition.

A very simplified example

Table1

Field1
------
1
2
5
9

Table2

Table2
------
3
4
5
6
9

Query1 will return 2 since only 5 and 9 are common.

Query2 will return 18 since a lot of rows from cross join will count.

If you have table with a lot of records, it will take a while to process your second query.

like image 115
Pablo Santa Cruz Avatar answered Oct 24 '22 10:10

Pablo Santa Cruz


It's important to realize that SQL is a declarative language and not an imperative one. You describe what conditions you want your data to fit and not how those comparisons should be executed. It's the job of the database to find the fastest way to give you an answer (a task taken over by the query optimizer). This means that a seemingly small change in your query can result in a wildly different query plan, which in turn results in a wildly different runtime behaviour.

The = comparison can be converted to and optimized the same way as a simple join on the two fields. This means that normal indices can be used to execute the query very fast, probably without reading the actual data and using only the indices instead.

A <> comparison on the other hand requires a full cartesian product to be calculated and checked for the condition, usually (there might be a way to optimize this with the correct index, but usually an index won't help here). It will also usually return a lot more results, which adds to the execution time.

like image 21
Joachim Sauer Avatar answered Oct 24 '22 10:10

Joachim Sauer


Probably, the second query processes way more rows than the first one.

like image 43
MvanGeest Avatar answered Oct 24 '22 12:10

MvanGeest


(Thinking back to a similar question)

Are you trying to count the rows in Table1 for which there is no matching record in Table2?

If so you could use this

SELECT COUNT(*) FROM Table1 TB1
WHERE NOT EXISTS
     (SELECT * FROM Table2 TB2 
      WHERE TB1.Field1 = TB2.Field2 )

or this for example

SELECT COUNT(*)
FROM
(
SELECT Field1 FROM Table1 
MINUS
SELECT Field2 FROM Table2
) T
like image 1
Martin Smith Avatar answered Oct 24 '22 11:10

Martin Smith