Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How bad is ignoring Oracle DUP_VAL_ON_INDEX exception?

I have a table where I'm recording if a user has viewed an object at least once, hence:

 HasViewed
     ObjectID  number (FK to Object table)
     UserId    number (FK to Users table)

Both fields are NOT NULL and together form the Primary Key.

My question is, since I don't care how many times someone has viewed an object (after the first), I have two options for handling inserts.

  • Do a SELECT count(*) ... and if no records are found, insert a new record.
  • Always just insert a record, and if it throws a DUP_VAL_ON_INDEX exceptions (indicating that there already was such a record), just ignore it.

What's the downside of choosing the second option?

UPDATE:

I guess the best way to put it is : "Is the overhead caused by the exception worse than the overhead caused by the initial select?"

like image 663
James Curran Avatar asked Dec 08 '08 20:12

James Curran


2 Answers

I would normally just insert and trap the DUP_VAL_ON_INDEX exception, as this is the simplest to code. This is more efficient than checking for existence before inserting. I don't consider doing this a "bad smell" (horrible phrase!) because the exception we handle is raised by Oracle - it's not like raising your own exceptions as a flow-control mechanism.

Thanks to Igor's comment I have now run two different benchamrks on this: (1) where all insert attempts except the first are duplicates, (2) where all inserts are not duplicates. Reality will lie somewhere between the two cases.

Note: tests performed on Oracle 10.2.0.3.0.

Case 1: Mostly duplicates

It seems that the most efficient approach (by a significant factor) is to check for existence WHILE inserting:

prompt 1) Check DUP_VAL_ON_INDEX
begin
   for i in 1..1000 loop
      begin
         insert into hasviewed values(7782,20);
      exception
         when dup_val_on_index then
            null;
      end;
   end loop
   rollback;
end;
/

prompt 2) Test if row exists before inserting
declare
   dummy integer;
begin
   for i in 1..1000 loop
      select count(*) into dummy
      from hasviewed
      where objectid=7782 and userid=20;
      if dummy = 0 then
         insert into hasviewed values(7782,20);
      end if;
   end loop;
   rollback;
end;
/

prompt 3) Test if row exists while inserting
begin
   for i in 1..1000 loop
      insert into hasviewed
      select 7782,20 from dual
      where not exists (select null
                        from hasviewed
                        where objectid=7782 and userid=20);
   end loop;
   rollback;
end;
/

Results (after running once to avoid parsing overheads):

1) Check DUP_VAL_ON_INDEX

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.54
2) Test if row exists before inserting

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.59
3) Test if row exists while inserting

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.20

Case 2: no duplicates

prompt 1) Check DUP_VAL_ON_INDEX
begin
   for i in 1..1000 loop
      begin
         insert into hasviewed values(7782,i);
      exception
         when dup_val_on_index then
            null;
      end;
   end loop
   rollback;
end;
/

prompt 2) Test if row exists before inserting
declare
   dummy integer;
begin
   for i in 1..1000 loop
      select count(*) into dummy
      from hasviewed
      where objectid=7782 and userid=i;
      if dummy = 0 then
         insert into hasviewed values(7782,i);
      end if;
   end loop;
   rollback;
end;
/

prompt 3) Test if row exists while inserting
begin
   for i in 1..1000 loop
      insert into hasviewed
      select 7782,i from dual
      where not exists (select null
                        from hasviewed
                        where objectid=7782 and userid=i);
   end loop;
   rollback;
end;
/

Results:

1) Check DUP_VAL_ON_INDEX

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.15
2) Test if row exists before inserting

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.76
3) Test if row exists while inserting

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.71

In this case DUP_VAL_ON_INDEX wins by a mile. Note the "select before insert" is the slowest in both cases.

So it appears that you should choose option 1 or 3 according to the relative likelihood of inserts being or not being duplicates.

like image 176
Tony Andrews Avatar answered Oct 01 '22 20:10

Tony Andrews


I don't think there is a downside to your second option. I think it's a perfectly valid use of the named exception, plus it avoids the lookup overhead.

like image 33
kurosch Avatar answered Oct 01 '22 20:10

kurosch