Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle primary key vs. index NOT IN performance

I have the following use case:

A table stores the changed as well as the original data from a person. My query is designed to get only one row for each person: The changed data if there is some, else the original data.

I populated the table with 100k rows of data and 2k of changed data. When using a primary key on my table the query runs in less than a half second. If I put an index on the table instead of a primary key the query runs really slow. So I'll use the primary key, no doubt about that.

My question is: Why is the PK approach so much faster than the one with an index?

Code here:

drop table up_data cascade constraints purge;
/
create table up_data(
 pk integer,
 hp_nr integer,
 up_nr integer,
 ps_flag varchar2(1),
 ps_name varchar2(100)
 -- comment this out and uncomment the index below.
, constraint pk_up_data primary key (pk,up_nr) 
);
/
-- insert some data
insert into up_data
select rownum, 1, 0, 'A', 'tester_' || to_char(rownum) 
from dual 
connect by rownum < 100000;
/
-- insert some changed data
-- change ps_flag = 'B' and mark it with a change number in up_nr
insert into up_data
select rownum, 1, 1, 'B', 'tester_' || to_char(rownum) 
from dual 
connect by rownum < 2000;
/
-- alternative(?) to the primary key
-- CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/

The select statement looks like this:

select count(*)
from
(
  select *
  from up_data u1 
  where up_nr = 1 
  or (up_nr = 0 
      and pk not in (select pk from up_data where up_nr = 1)
      )
) u

The statement might be target of optimization but for the moment it will stay like this.

like image 203
Jakob Avatar asked Jun 14 '13 09:06

Jakob


People also ask

Which is faster primary key or index?

Searching data using a Primary Key will help you ensure you have the correct information but doesn't ensure a speedy search result. A Clustered Index will perform our search fast.

Does primary key affect performance?

Primary keys are automatically indexed to facilitate effective information retrieval. The primary key index is the most effective access path for the table. Other columns or combinations of columns may be defined as a secondary index to improve performance in data retrieval.

Do we need index on primary key Oracle?

There is no such thing as a "primary key index". A primary key under the covers will use either a UNIQUE or NON-UNIQUE index. If the primary key is deferrable it'll use a non-unique index. If the column(s) are already indexed with a non-unique index, the primary key constraint will rely on that index.

When should I use primary key or index?

A primary key is unique, whereas an index does not have to be unique. Therefore, the value of the primary key identifies a record in a table, the value of the index not necessarily. Primary keys usually are automatically indexed - if you create a primary key, no need to create an index on the same column(s).


1 Answers

When you create a primary key constraint, Oracle also creates an index to support this at the same time. A primary key index has a couple of important differences over a basic index, namely:

  • All the values in this are guaranteed to be unique
  • There's no nulls in the table rows (of the columns forming the PK)

These reasons are the key to the performance differences you see. Using your setup, I get the following query plans:

--fast version with PK
explain plan for 
select count(*)
from
(
  select *
  from up_data u1 
  where up_nr = 1 
  or (up_nr = 0 
      and pk not in (select pk from up_data where up_nr = 1)
      )
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));

-----------------------------------------------------                                                                                                                                                                                                                                                        
| Id  | Operation              | Name       | Rows  |                                                                                                                                                                                                                                                        
-----------------------------------------------------                                                                                                                                                                                                                                                        
|   0 | SELECT STATEMENT       |            |     1 |                                                                                                                                                                                                                                                        
|   1 |  SORT AGGREGATE        |            |     1 |                                                                                                                                                                                                                                                        
|   2 |   FILTER               |            |       |                                                                                                                                                                                                                                                        
|   3 |    INDEX FAST FULL SCAN| PK_UP_DATA |   103K|                                                                                                                                                                                                                                                        
|   4 |    INDEX UNIQUE SCAN   | PK_UP_DATA |     1 |                                                                                                                                                                                                                                                        
-----------------------------------------------------     

alter table up_data drop constraint pk_up_data;
CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/

--slow version with normal index
explain plan for
select count(*)
from
(
  select *
  from up_data u1 
  where up_nr = 1 
  or (up_nr = 0 
      and pk not in (select pk from up_data where up_nr = 1)
      )
) u
/

select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));

------------------------------------------------------                                                                                                                                                                                                                                                       
| Id  | Operation              | Name        | Rows  |                                                                                                                                                                                                                                                       
------------------------------------------------------                                                                                                                                                                                                                                                       
|   0 | SELECT STATEMENT       |             |     1 |                                                                                                                                                                                                                                                       
|   1 |  SORT AGGREGATE        |             |     1 |                                                                                                                                                                                                                                                       
|   2 |   FILTER               |             |       |                                                                                                                                                                                                                                                       
|   3 |    INDEX FAST FULL SCAN| IDX_UP_DATA |   103K|                                                                                                                                                                                                                                                       
|   4 |    INDEX FAST FULL SCAN| IDX_UP_DATA |  1870 |                                                                                                                                                                                                                                                       
------------------------------------------------------ 

The big difference is that the fast version employs a INDEX UNIQUE SCAN, rather than a INDEX FAST FULL SCAN in the second access of the table data.

From the Oracle docs (emphasis mine):

In contrast to an index range scan, an index unique scan must have either 0 or 1 rowid associated with an index key. The database performs a unique scan when a predicate references all of the columns in a UNIQUE index key using an equality operator. An index unique scan stops processing as soon as it finds the first record because no second record is possible.

This optimization to stop processing proves to be a significant factor in this example. The fast version of your query:

  • Full scans ~103,000 index entries
  • For each one of these finds one matching row in the PK index and stop processing the second index further

The slow version:

  • Full scans ~103,000 index entries
  • For each one of these performs another scan of the 103,000 rows to find if there's any matches.

So to compare the work done:

  • With the PK, we have one fast full scan, then 103,000 lookups of one index value
  • With normal index, we have one fast full scan then 103,000 scans of 103,000 index entries - several orders of magnitude more work!

In this example, both the uniqueness of the primary key and the not null-ness of the index values are necessary to get the performance benefit:

-- create index as unique - we still get two fast full scans
drop index index idx_up_data;
create unique index idx_up_data ON up_data(pk, up_nr);

explain plan for 
select count(*)
from
(
  select *
  from up_data u1 
  where up_nr = 1 
  or (up_nr = 0 
      and pk not in (select pk from up_data where up_nr = 1)
      )
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));

------------------------------------------------------                                                                                                                                                                                                                                                       
| Id  | Operation              | Name        | Rows  |                                                                                                                                                                                                                                                       
------------------------------------------------------                                                                                                                                                                                                                                                       
|   0 | SELECT STATEMENT       |             |     1 |                                                                                                                                                                                                                                                       
|   1 |  SORT AGGREGATE        |             |     1 |                                                                                                                                                                                                                                                       
|   2 |   FILTER               |             |       |                                                                                                                                                                                                                                                       
|   3 |    INDEX FAST FULL SCAN| IDX_UP_DATA |   103K|                                                                                                                                                                                                                                                       
|   4 |    INDEX FAST FULL SCAN| IDX_UP_DATA |  1870 |                                                                                                                                                                                                                                                       
------------------------------------------------------     

-- now the columns are not null, we see the index unique scan
alter table up_data modify (pk not null, up_nr not null);

explain plan for
select count(*)
from
(
  select *
  from up_data u1 
  where up_nr = 1 
  or (up_nr = 0 
      and pk not in (select pk from up_data where up_nr = 1)
      )
) u
/

select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));

------------------------------------------------------                                                                                                                                                                                                                                                       
| Id  | Operation              | Name        | Rows  |                                                                                                                                                                                                                                                       
------------------------------------------------------                                                                                                                                                                                                                                                       
|   0 | SELECT STATEMENT       |             |     1 |                                                                                                                                                                                                                                                       
|   1 |  SORT AGGREGATE        |             |     1 |                                                                                                                                                                                                                                                       
|   2 |   FILTER               |             |       |                                                                                                                                                                                                                                                       
|   3 |    INDEX FAST FULL SCAN| IDX_UP_DATA |   103K|                                                                                                                                                                                                                                                       
|   4 |    INDEX UNIQUE SCAN   | IDX_UP_DATA |     1 |                                                                                                                                                                                                                                                       
------------------------------------------------------  
like image 171
Chris Saxon Avatar answered Oct 29 '22 03:10

Chris Saxon