Oracle - Table alias and NULL evaluation in join

Tags:

I was just trying to make an example to explain how NULL in Oracle can lead to 'unexpected' behaviours, but I've found something I did not expect...

setup:

Click to copy

create table tabNull (val varchar2(10), descr varchar2(100));
insert into tabNull values (null, 'NULL VALUE');
insert into tabNull values ('A', 'ONE CHAR');

This gives what I expected:

Click to copy

SQL> select * from tabNull T1 inner join tabNull T2 using(val);

VAL        DESCR                DESCR
---------- -------------------- --------------------
A          ONE CHAR             ONE CHAR

If I remove table aliases, I get:

Click to copy

SQL> select * from tabNull inner join tabNull using(val);

VAL        DESCR                DESCR
---------- -------------------- --------------------
A          ONE CHAR             ONE CHAR
A          ONE CHAR             ONE CHAR

and this is quite surprising to me.

A reason can be found in the execution plans for the two queries; with table aliases, Oracle makes an HASH JOIN and then checks for T1.val = T2.val:

Click to copy

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |   118 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN         |         |     1 |   118 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| TABNULL |     2 |   118 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| TABNULL |     2 |   118 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."VAL"="T2"."VAL")

Without aliases, it first filters one occurrence of the table for not null values, thus picking only one row, and then it makes a CARTESIAN with the second occurrence, thus giving two rows; even if it's correct, I would expect the result of a cartesian, but I don't have any row with DESCR = 'NULL VALUE'.

Click to copy

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |     2 |   118 |     6   (0)| 00:00:01 |
|   1 |  MERGE JOIN CARTESIAN|         |     2 |   118 |     6   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | TABNULL |     1 |    59 |     3   (0)| 00:00:01 |
|   3 |   BUFFER SORT        |         |     2 |       |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL | TABNULL |     2 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("TABNULL"."VAL" IS NOT NULL)

Is this somehow correct / expected? Isn't the result value of the cartesian even stranger than the number of returned rows? Am I misunderstanding the plans, or missing something so big that I can't see?

786

asked Aug 12 '16 12:08

Aleksej

2 Answers

According to http://docs.oracle.com/javadb/10.10.1.2/ref/rrefsqljusing.html using(val) translates here as ON tabnull.val=tabnull.val So

Click to copy

select tabNull.*, tabNull.descr from tabNull inner join tabNull 
on tabNull.val = tabNull.val;

Next to build a plan Oracle must [virtually] assign different aliases for every JOIN member but sees no reason to use second alias at any place in SELECT and ON. So

Click to copy

select t1.*, t1.descr from tabNull t1 inner join tabNull t2 
on t1.val = t1.val;

Plan

Click to copy

--------------------------------------------------------------------------------                        
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |                        
--------------------------------------------------------------------------------                        
|   0 | SELECT STATEMENT     |         |     2 |    28 |     4   (0)| 00:00:01 |                        
|   1 |  MERGE JOIN CARTESIAN|         |     2 |    28 |     4   (0)| 00:00:01 |                        
|*  2 |   TABLE ACCESS FULL  | TABNULL |     1 |    14 |     2   (0)| 00:00:01 |                        
|   3 |   BUFFER SORT        |         |     2 |       |     2   (0)| 00:00:01 |                        
|   4 |    TABLE ACCESS FULL | TABNULL |     2 |       |     2   (0)| 00:00:01 |                        
--------------------------------------------------------------------------------                        


Predicate Information (identified by operation id):                                                     
---------------------------------------------------                                                     

   2 - filter("T1"."VAL" IS NOT NULL)

171

answered Oct 03 '22 23:10

Serg

EDIT: I say below that the syntax is illegal; on further thought, that's BS on my part, I don't know that for a fact (I can't point to where in the language definition aliases are required for a self-join). I still believe the explanation below is probably correct, whether it is for the "bug" or for the "undefined behavior" I mention below.

The syntax is illegal (you knew that - you were just curious to see what would happen, and if you can understand the output). I agree with jarlh that you should have received an error message. Clearly Oracle didn't code it that way.

Since this is not valid syntax, what you are seeing can't be called a bug (so I disagree with Nick's comment). The behavior is "undefined" - when you use syntax that is not supported by the Oracle language definition, you may get any kind of crazy results, for which Oracle is not taking any responsibility.

OK, with that out of the way, is there any explanation for what you are seeing? I believe it is indeed a Cartesian join, and not a union as Nick suggested.

Let's put ourselves in the optimizer's shoes. It sees the first table in the FROM list, it scans it, so far so good.

Then it reads the second table, and it has a list of columns like this:

tabNULL.val, tabNULL.descr, tabNULL.val, tabNULL.descr

The join condition is tabNULL.val = tabNULL.val

The optimizer is dumb, it is not smart. It, unlike you, doesn't realize at this point that tabNULL is meant to stand for two different incarnations of the table. It thinks tabNULL.val on both sides of the equation are THE SAME value and they both refer to the first "incarnation" of the table. The only case when that fails is if tabNULL.val is NULL, so it REWRITES the query with the clause becoming tabNULL.val IS NOT NULL.

Only the FIRST table is checked for tabNULL.val IS NOT NULL; the optimizer doesn't "know" tabNULL.val appears again in the list and it may have a DIFFERENT meaning! Then the join happens; at this point there are no other conditions left, so BOTH rows in the second incarnation of the table will produce rows in the join, for A, ONE CHAR from the first table.

Then, in the projection, again only the FIRST tabNULL.val will be read and will populate BOTH columns in the output. You ask the query engine to return the value tabNULL.val twice, and in your mind it's from different places, but there is only one memory location labeled tabNULL.val, and it stores what came from the first table.

Of course, very few know with any certainty what the optimizer and the query engine do, but in this case I think this is a pretty safe guess.

answered Oct 03 '22 21:10

mathguy

Related questions
                            
                                What's the recommended way to pass the results of macro computations to run-time?
                            
                                Pulling data off of sql table with php using sessions
                            
                                Select all XML attributes in SQL
                            
                                subquery - how to refer to outer query value
                            
                                Is Rollback automatic in a "Using" scope with C# SQL Server calls?
                            
                                Full text search in sql server
                            
                                How do i search from serialize field in mysql database?
                            
                                Laravel - multi-insert rows and retrieve ids
                            
                                Fetching rows at extremely high speed
                            
                                SQL query multiple tables, with multiple joins and column field with comma separated list
                            
                                What does parsing a query mean?
                            
                                Converting Multiple rows into columns in MySQL
                            
                                Why is SqlQuery a lot faster than using LINQ expression on views?
                            
                                SQL how to compare two columns from two different tables
                            
                                SqlDependency fires immediately
                            
                                Postgres return null values on function error/failure when casting
                            
                                ROW_Count() To Start Over Based On Order
                            
                                How to GROUP entries BY uninterrupted sequence?
                            
                                Difference in executing query via Hibernate and PostgreSQL leading to PSQLException
                            
                                Sequelize: join two times the same table

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Oracle - Table alias and NULL evaluation in join

Tags:

sql

null

oracle

Aleksej

People also ask

2 Answers

Serg

mathguy

Recent Activity

Donate For Us