 

Avoid redundant updates

Tags: sql, oracle, plsql

I'm working on an I/O-bound system (and this isn't going to change). So I'm rewriting some SQL to only update when it needs to, and it's going really well: I'm seeing about a 70% increase in performance. The only problem is that the SQL is more bloated, which isn't the end of the world, just more code to maintain.

So my question is: is there an easier way to get Oracle to only update when it needs to, compared to adding a where clause like this:

update table_name
   set field_one = 'one'
 where field_one != 'one';

Note: the real code is much more complex, so adding a where clause like this can sometimes double the length of the query.

Using 11g

asked Jul 28 '14 by Bill


4 Answers

Given the nature of how SQL works, this is exactly what you need to do. If you tell it:

update table_name    
   set field_one = 'one';

that means something entirely different in SQL than

update table_name
   set field_one = 'one'
 where field_one != 'one';

The database can only process what you told it to process. In the first case, because there is no where clause, you have told it to process all the records.

In the second case you have put a filter on it to process only some specific records.

It is up to the code writer, not the database, to determine the content of the query. If you didn't want every record updated, you should not have told it to do so. The database is quite literal about the commands you give it. Yes, the second set of queries is longer, because it is more specific; it has a different meaning than the original queries. That is all to the good, as it is far faster to update the ten records you are interested in than all 1,000,000 records in the table.

You need to get over the idea that longer is somehow a bad thing in database queries. Often it is a good thing, as you are being more correct in what you are asking for. Your original queries were simply incorrect, and now you have to pay the price to fix what was a systemically bad practice.
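
For a multi-column update, the pattern just grows by one comparison per column. As a rough sketch (the table and column names below are made up, not the OP's real schema), a NULL-aware way to write it is:

update table_name
   set field_one = 'one',
       field_two = 'two'
 where decode(field_one, 'one', 1, 0) = 0   -- differs from 'one', or is NULL
    or decode(field_two, 'two', 1, 0) = 0;  -- differs from 'two', or is NULL

DECODE(col, value, 1, 0) = 0 is true when the column does not already hold the target value, including when it is NULL (a plain != would silently skip NULL rows), so the statement still only touches rows where at least one column would actually change.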

answered by HLGEM


I guess there isn't an easier way....

answered by Bill


Create a view over the table and write a custom INSTEAD OF trigger:

--DDL create your sample table:
create table table_name (
   field_one varchar2(100)
  );

--DDL create the view
create view view_name as 
select * from table_name;    

--DDL creating the INSTEAD OF trigger
--(note: this sample table has no key column; the fuller example below
-- also matches the row's primary key in its WHERE clause)
CREATE OR REPLACE TRIGGER trigger_name
   INSTEAD OF UPDATE ON view_name
   FOR EACH ROW
   BEGIN
     UPDATE table_name
     SET field_one = :NEW.field_one
     where :OLD.field_one != :NEW.field_one
     ;
   END trigger_name;

--DML testing:
update view_name set field_one = 'one';

EDIT: test results

I have tested my approach in a scenario with 200K rows and a 1/20 factor of updatable rows. Here are the results:

  • Script time updating table directly: 1.559,05 secs
  • Script time updating through trigger: 1.101,14 secs

Steps to reproduce the test:

Creating table, view and trigger:

create table table_name (
   myPK int primary key,
   field_1 varchar2(100),
   field_2 varchar2(100),
   field_3 varchar2(4000)
  );

create view view_name as 
select * from table_name;

CREATE OR REPLACE TRIGGER trigger_name
   INSTEAD OF UPDATE ON view_name
   FOR EACH ROW
   BEGIN
     UPDATE table_name
     SET 
        field_1 = :NEW.field_1,
        field_2 = :NEW.field_2
     where 
        myPK = :OLD.myPK
        AND not ( :OLD.field_1 = :NEW.field_1 and
                  :OLD.field_2 = :NEW.field_2 )
     ;
   END trigger_name;

Inserting dummy data with 1/20 factor for updatable rows:

DECLARE
   x NUMBER := 300000;
BEGIN
   FOR i IN 1..x LOOP
      IF MOD(i,20) = 0 THEN     
         INSERT INTO table_name VALUES (i, 'rare', 'hello', 
                                           dbms_random.string('A', 2000));
      ELSE
         INSERT INTO table_name VALUES (i, 'ordinary', 'bye', 
                                           dbms_random.string('A', 2000) );
      END IF;
   END LOOP;
   COMMIT;
END;

Script for testing performance:

declare
    l_start number;
    l_end number;
    l_diff number;    
    rows2update int;

begin
   rows2update := 100000;
   l_start := dbms_utility.get_time ;
   FOR i IN 1..rows2update LOOP

      update view_name    --<---- replace by table_name to test without trigger
         set field_1 = 'ordinary' 
       where myPK = round( dbms_random.value(1,300000) )    ; 
      commit;

   end loop;

   l_end := dbms_utility.get_time ;
   dbms_output.put_line('l_start ='||l_start);
   dbms_output.put_line('l_end ='||l_end);
   l_diff := (l_end-l_start)/100;
   dbms_output.put_line('Elapsed Time: '|| l_diff ||' secs');

end;
/

Disclaimer: This is a simple test made in a virtual environment, only as a first-approach test. I'm sure that results can change significantly just by changing the field length or other parameters.

answered by dani herrera


Try using a MERGE statement; it may reduce the running time of the update queries.

The above query can be rewritten like this:

MERGE INTO table_name
USING ( SELECT ROWID from table_name Where field_one != 'one') data_table
ON ( table_name.ROWID = data_table.ROWID)
WHEN MATCHED THEN
UPDATE SET table_name.field_one = 'one';
answered by anandh04