Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which is the best choice in delete-insert vs if-update else-insert?

Update:

My bad...I have an primary key on those tables..I meant no further indexing currently on the tables. We might have it in the future after seeing the performance and since we have too many filters on the data in retrieving data it did not show much improvement on indexing last time we ran database tuning.

I have a 4 huge tables over millions of records. Now there is stored procedure which is called frequently and updates these table. Here is the scenario -

Now if entry exists for today I need to update it for today and else if entry is not there for the user I need to go ahead and insert an entry for the user. Now there are two ways to go about carrying out these since there is a single proc that does this -

First Way -

IF EXISTS(TABLE1)
--UPDATE where condn
ELSE
--INSERT
IF EXISTS(TABLE2)
--UPDATE where condn
ELSE
--INSERT
IF EXISTS(TABLE3)
--UPDATE where condn
ELSE
--INSERT
IF EXISTS(TABLE4)
--UPDATE where condn
ELSE
--INSERT

Second Way -

DELETE from TABLE1 where condn
DELETE from TABLE2 where condn
DELETE from TABLE3 where condn
DELETE from TABLE4 where condn

INSERT TABLE1 ENTRY
INSERT TABLE2 ENTRY
INSERT TABLE3 ENTRY
INSERT TABLE4 ENTRY

Now the second way looks simpler but it might be more time consuming ...I am not sure which way is the best here. Can anyone please help or guide me out here..thanks!

like image 641
Vishal Avatar asked Dec 02 '10 19:12

Vishal


People also ask

Is it better to delete and insert or update?

For best future query performance, it's better to do an update to keep the same extents. Delete and insert will not necessarily use the same extents. For a table of that size, it would be unlikely to do so.

Which is faster delete or insert?

Inserting rows in a table is faster than deleting them. Loading data into a new table using create-table-as-select (CTAS) is faster still. So if you're removing most of the rows from a table, instead of issuing a delete you can: Create a new table saving the rows you want to keep.

Which is faster update or insert?

In SQL, i think inserting would be faster for the simple reason that while inserting you don't have to look up for anything, however for updating , first of all you would have to find the tuple of the data and then perform the update operation. Hence insert is better than update.

Why insert is faster than delete?

There is a high redo size for a delete compared to an insert.


2 Answers

If you expect mostly inserts, try this

...
BEGIN TRY
   INSERT table1
END TRY
BEGIN CATCH
   IF ERROR_NUMBER = 2627
       UPDATE table1
   ELSE
       --process real error
END CATCH
...

Mostly updates...

...
BEGIN TRY
   UPDATE table1 ... WHERE ...
   IF @@ROWCOUNT = 0
      INSERT Table1
END TRY
BEGIN CATCH
   --optional. if someone manages to insert before here, do we update it? or just ignore it?
   IF ERROR_NUMBER = 2627
       UPDATE table1
   ELSE
       --process real error
END CATCH
...

EXISTS isn't concurrent enough under high loads. If you are going to scan the table with EXISTS, you may as well just try the INSERT anyway.

Other answers: One, Two, Three

Edit: I call this the JFDI pattern...

like image 92
gbn Avatar answered Oct 02 '22 16:10

gbn


The if-exists-then-update-else-insert approach may be more code than delete-insert, but (depending on how many and what kind of indexes are defined on the tables) it is a lot less work for the server.

  • A DELETE or INSERT operation requires every index to be modified, period.
  • An UPDATE operation only requires these indexes modified whose fields have been updated in this instance.

So unless you modify every indexed field of the record with your updates, the longer approach is the more efficient one.


EDIT: Your question update says that currently you do not have any indexes apart from the primary key (which I assume is a clustered key). So in comparison:

When the row already exists, it's 1 clustered index seek (assuming you find records by their PK) and:

  • delete-insert: 4 write operations (delete row, delete PK index row, insert row, insert PK index row)
  • check-update/insert: 1 write operation (update row)

When the row does not yet exist, it's 1 clustered index seek and:

  • delete-insert: 2 write operations (insert row, insert PK index row)
  • check-update/insert: 2 write operations (insert row, insert PK index row)

The more indexes exist, the worse will delete/insert perform for rows that already exist. Plus it can cause non-clustered indexes to become unnecessarily fragmented due to avoidable write operations.

like image 33
Tomalak Avatar answered Oct 02 '22 15:10

Tomalak