I have a table where I'm recording if a user has viewed an object at least once, hence:
HasViewed
ObjectID number (FK to Object table)
UserId number (FK to Users table)
Both fields are NOT NULL and together form the Primary Key.
My question is, since I don't care how many times someone has viewed an object (after the first), I have two options for handling inserts.
What's the downside of choosing the second option?
UPDATE:
I guess the best way to put it is : "Is the overhead caused by the exception worse than the overhead caused by the initial select?"
I would normally just insert and trap the DUP_VAL_ON_INDEX exception, as this is the simplest to code. This is more efficient than checking for existence before inserting. I don't consider doing this a "bad smell" (horrible phrase!) because the exception we handle is raised by Oracle - it's not like raising your own exceptions as a flow-control mechanism.
Thanks to Igor's comment I have now run two different benchamrks on this: (1) where all insert attempts except the first are duplicates, (2) where all inserts are not duplicates. Reality will lie somewhere between the two cases.
Note: tests performed on Oracle 10.2.0.3.0.
Case 1: Mostly duplicates
It seems that the most efficient approach (by a significant factor) is to check for existence WHILE inserting:
prompt 1) Check DUP_VAL_ON_INDEX
begin
for i in 1..1000 loop
begin
insert into hasviewed values(7782,20);
exception
when dup_val_on_index then
null;
end;
end loop
rollback;
end;
/
prompt 2) Test if row exists before inserting
declare
dummy integer;
begin
for i in 1..1000 loop
select count(*) into dummy
from hasviewed
where objectid=7782 and userid=20;
if dummy = 0 then
insert into hasviewed values(7782,20);
end if;
end loop;
rollback;
end;
/
prompt 3) Test if row exists while inserting
begin
for i in 1..1000 loop
insert into hasviewed
select 7782,20 from dual
where not exists (select null
from hasviewed
where objectid=7782 and userid=20);
end loop;
rollback;
end;
/
Results (after running once to avoid parsing overheads):
1) Check DUP_VAL_ON_INDEX
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.54
2) Test if row exists before inserting
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.59
3) Test if row exists while inserting
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.20
Case 2: no duplicates
prompt 1) Check DUP_VAL_ON_INDEX
begin
for i in 1..1000 loop
begin
insert into hasviewed values(7782,i);
exception
when dup_val_on_index then
null;
end;
end loop
rollback;
end;
/
prompt 2) Test if row exists before inserting
declare
dummy integer;
begin
for i in 1..1000 loop
select count(*) into dummy
from hasviewed
where objectid=7782 and userid=i;
if dummy = 0 then
insert into hasviewed values(7782,i);
end if;
end loop;
rollback;
end;
/
prompt 3) Test if row exists while inserting
begin
for i in 1..1000 loop
insert into hasviewed
select 7782,i from dual
where not exists (select null
from hasviewed
where objectid=7782 and userid=i);
end loop;
rollback;
end;
/
Results:
1) Check DUP_VAL_ON_INDEX
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
2) Test if row exists before inserting
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.76
3) Test if row exists while inserting
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.71
In this case DUP_VAL_ON_INDEX wins by a mile. Note the "select before insert" is the slowest in both cases.
So it appears that you should choose option 1 or 3 according to the relative likelihood of inserts being or not being duplicates.
I don't think there is a downside to your second option. I think it's a perfectly valid use of the named exception, plus it avoids the lookup overhead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With