
UNIQUE constraint vs checking before INSERT

I have a SQL Server table RealEstate with columns Id, Property, Property_Value. This table has about 5-10 million rows and may grow further in the future. I want to insert a row only if the combination of Id, Property, Property_Value does not already exist in this table.

Example Table -

    1,Rooms,5
    1,Bath,2
    1,Address,New York
    2,Rooms,2
    2,Bath,1
    2,Address,Miami

Inserting 2,Address,Miami should NOT be allowed. But, 2,Price,2billion is okay. I am curious to know which is the "best" way to do this and why. The why part is most important to me. The two ways of checking are -

  1. At application level - The app should check if a row exists before it inserts a row.
  2. At database level - Set unique constraints on all 3 columns and let the database do the checking instead of person/app.

Is there any scenario where one would be better than the other ?

Thanks.

PS: I know there is a similar question already, but it does not answer my problem: "Unique constraint vs pre checking". Also, I think that UNIQUE is applicable to all databases, so I don't think I should remove the mysql and oracle tags.

Asked by Steam, Feb 19 '14 19:02


1 Answer

I think in most cases the differences between the two are going to be small enough that the choice should mostly be driven by picking the implementation that is most understandable to someone looking at the code for the first time.

However, I think exception handling has a few small advantages:

  • Exception handling avoids a potential race condition. The 'check, then insert' method might fail if another process inserts a record between your check and your insert. So even if you're doing 'check, then insert' you still want exception handling on the insert, and if you're already doing exception handling anyway then you might as well do away with the initial check.

  • If your code is not a stored procedure and has to interact with the database via the network (i.e. the application and the db are not on the same box), then you want to avoid having two separate network calls (one for the check and the other for the insert) and doing it via exception handling provides a straightforward way of handling the whole thing with a single network call. Now, there are tons of ways to do the 'check then insert' method while still avoiding the second network call, but simply catching the exception is likely to be the simplest way to go about it.
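As a rough sketch of the 'just insert and catch the exception' approach from the two points above, the snippet below uses Python's standard-library sqlite3 module with an in-memory SQLite database. SQLite is only a stand-in for SQL Server here (an assumption made so the example runs anywhere), and the helper name insert_if_absent and the constraint name uq_realestate are hypothetical. With a real SQL Server driver the exception class and error codes would differ.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a UNIQUE constraint on all three columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE RealEstate (
            Id             INTEGER NOT NULL,
            Property       TEXT    NOT NULL,
            Property_Value TEXT    NOT NULL,
            CONSTRAINT uq_realestate UNIQUE (Id, Property, Property_Value)
        )
        """
    )
    return conn


def insert_if_absent(conn, row):
    """Attempt the insert directly; return True if stored, False if a duplicate.

    There is no preliminary SELECT, so there is no window between a check
    and the insert in which another writer could add the same row.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO RealEstate (Id, Property, Property_Value) "
                "VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the UNIQUE constraint rejects the row.
        return False


conn = make_db()
results = [
    insert_if_absent(conn, (2, "Address", "Miami")),   # new row
    insert_if_absent(conn, (2, "Address", "Miami")),   # exact duplicate
    insert_if_absent(conn, (2, "Price", "2billion")),  # differs in two columns
]
row_count = conn.execute("SELECT COUNT(*) FROM RealEstate").fetchone()[0]
```

The duplicate is rejected by the database itself, so the application makes one call per row and the outcome stays correct even if several writers run concurrently.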

On the other hand, exception handling requires a unique constraint (which is really a unique index), which comes with a performance tradeoff:

  • Creating a unique constraint will be slow on very large tables, and maintaining its index adds a cost to every single insert into that table. On truly large databases you also have to budget for the extra disk space consumed by the unique index used to enforce the constraint.
  • In return, that index might make selecting from the table faster if your queries can take advantage of it.
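To illustrate that a unique constraint is backed by an index that queries can also use, here is a small sketch, again with SQLite standing in for SQL Server (an assumption; SQL Server's optimizer and its plan output are different). It lists the index SQLite creates automatically for the constraint and asks the planner how it would run a lookup on the leading columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RealEstate ("
    " Id INTEGER NOT NULL,"
    " Property TEXT NOT NULL,"
    " Property_Value TEXT NOT NULL,"
    " UNIQUE (Id, Property, Property_Value))"
)
conn.executemany(
    "INSERT INTO RealEstate VALUES (?, ?, ?)",
    [
        (1, "Rooms", "5"),
        (1, "Bath", "2"),
        (1, "Address", "New York"),
        (2, "Rooms", "2"),
        (2, "Bath", "1"),
        (2, "Address", "Miami"),
    ],
)

# The UNIQUE constraint is enforced through an automatically created index;
# column 1 of each index_list row is the index name.
indexes = [row[1] for row in conn.execute("PRAGMA index_list(RealEstate)")]

# Ask the planner how it would answer a lookup on the leading key columns.
# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT Property_Value FROM RealEstate WHERE Id = ? AND Property = ?",
    (2, "Address"),
).fetchall()
plan_text = " ".join(str(row[3]) for row in plan)

print(indexes)
print(plan_text)
```

The printed plan should mention an index search and not a full table scan. The same idea applies to SQL Server, where the execution plan would show whether the constraint's index is used.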

I'd also note that if you're in a situation where what you actually want to do is 'update else insert' (i.e. if a record with the unique value already exists then you want to update that record, else you insert a new record) then what you actually want to use is your particular database's UPSERT method, if it has one. For SQL Server and Oracle, this would be a MERGE statement.
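A minimal sketch of the 'update else insert' pattern follows. It does not use MERGE: it relies on SQLite's INSERT ... ON CONFLICT ... DO UPDATE clause (available in SQLite 3.24 and later) as a stand-in, run through Python's sqlite3 module. It also assumes a different key from the question, namely uniqueness on (Id, Property) only, so that Property_Value is something that can be updated. The function name upsert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RealEstate ("
    " Id INTEGER NOT NULL,"
    " Property TEXT NOT NULL,"
    " Property_Value TEXT NOT NULL,"
    " UNIQUE (Id, Property))"  # assumption: one value per (Id, Property)
)


def upsert(conn, id_, prop, value):
    """Insert the row, or update Property_Value if (Id, Property) already exists."""
    with conn:  # one statement, one transaction
        conn.execute(
            "INSERT INTO RealEstate (Id, Property, Property_Value) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT (Id, Property) "
            "DO UPDATE SET Property_Value = excluded.Property_Value",
            (id_, prop, value),
        )


upsert(conn, 2, "Rooms", "2")  # inserts
upsert(conn, 2, "Rooms", "3")  # same key: updates the value to "3"
upsert(conn, 2, "Bath", "1")   # different key: inserts

rows = conn.execute(
    "SELECT Id, Property, Property_Value FROM RealEstate ORDER BY Property"
).fetchall()
```

As with the plain insert, the database resolves the conflict inside a single statement, so the application does not need a separate existence check.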

Answered by ivanatpr, Oct 11 '22 06:10