In a discussion with a friend, I heard two things:

1. Using constraints causes a slight decrease in performance. E.g., consider a uniqueness constraint: before an insertion, the DBMS would have to check for uniqueness across all of the existing data, causing extra computation.
2. He suggested making sure that these constraints are handled in the application-level logic itself. E.g., delete rows from both tables yourself properly, instead of putting a foreign key integrity constraint on them.

The first one sounds somewhat logical to me, but the second one seems intuitively wrong. I don't have enough experience with DBMSs to really judge these claims, though.

Q: Is claim 1 correct? If so, is claim 2 even the right way to handle such scenarios?
Integrity constraints are used to maintain the quality of information. They ensure that insertions, updates, and other operations are performed in such a way that data integrity is not affected. Thus, integrity constraints guard against accidental damage to the database.
Advantages of integrity constraints: because you define integrity constraints using SQL statements, no additional programming is required when you define or alter a table. The SQL statements are easy to write and eliminate programming errors.
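For illustration, a minimal sketch of what declaring such constraints might look like (table and column names are made up):

```sql
-- Constraints are declared once, in the DDL, and the DBMS enforces
-- them on every insert, update, and delete from then on.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,          -- uniqueness + NOT NULL, enforced by the DBMS
    email       VARCHAR(255) NOT NULL UNIQUE  -- no duplicate emails can ever be inserted
);

CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    -- Referential integrity: an order can never point at a missing customer.
    CONSTRAINT fk_orders_customer
        FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);
```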
Foreign key indexes can significantly improve performance for queries that involve joins between the parent and child tables.
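Note, though, that in several DBMSs (PostgreSQL and Oracle, for example) declaring a foreign key does not automatically index the referencing column, so that index is typically created explicitly; continuing the sketch above:

```sql
-- Speeds up parent/child joins, and also speeds up the DBMS's own FK
-- checks when a parent row is deleted or its key is updated.
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
```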
What are the benefits of enforcing integrity constraints as part of the database design and implementation process (instead of doing it in application design)? It makes it easier to maintain the database structure and provides a good representation of the data that allows for easy upgrades in the future.
If your data needs to be correct, you need to enforce the constraints, and if you need to enforce the constraints, letting the database do it for you will be faster than anything else (and likely more correct too).
Attempting to enforce something like key uniqueness at the application level can be done correctly or quickly, but not both. For example, let's say you want to insert a new row. A naive application-level algorithm could look something like this:

1. SELECT to check whether the key value already exists.
2. If it doesn't exist, INSERT the new row.
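In SQL terms, that naive sequence might look like this (table and column names are hypothetical):

```sql
-- Step 1: check whether the key value is already taken.
SELECT COUNT(*) FROM users WHERE username = 'alice';

-- Step 2: the application sees 0 and inserts. Nothing stops another
-- session from inserting the same username between these two statements.
INSERT INTO users (username, email) VALUES ('alice', 'alice@example.com');
```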
And that would actually work in a single-client / single-threaded environment. However, in a concurrent environment, some other client could write that same key value in between your steps 1 and 2, and presto: you have yourself a duplicate in your data without even knowing it!
To prevent such a race condition, you'd have to use some form of locking, and since you are inserting a new row, there is no row to lock yet - you'll likely end up locking the entire table, destroying scalability in the process.
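A sketch of that heavy-handed workaround (PostgreSQL-style lock syntax assumed; table names as before):

```sql
BEGIN;
-- There is no 'alice' row to lock yet, so the whole table gets locked:
-- every other writer now waits, even ones inserting unrelated keys.
LOCK TABLE users IN EXCLUSIVE MODE;
SELECT COUNT(*) FROM users WHERE username = 'alice';
INSERT INTO users (username, email) VALUES ('alice', 'alice@example.com');
COMMIT;
```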
OTOH, if you let the DBMS do it for you, it can enforce uniqueness with much finer-grained locking (typically at the level of the underlying unique index), using code that has been tested and double-tested for correctness in all the tricky concurrent edge cases, and whose performance has been optimized over the time the DBMS has been on the market.
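With a declared constraint, the application simply inserts and handles the rare duplicate-key error; a minimal sketch:

```sql
-- Declared once; the unique index behind it performs the check safely
-- under concurrency.
ALTER TABLE users ADD CONSTRAINT uq_users_username UNIQUE (username);

-- A single statement: it either succeeds or fails atomically with a
-- duplicate-key error that the application can catch and report.
INSERT INTO users (username, email) VALUES ('alice', 'alice@example.com');
```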
Similar concerns exist for foreign keys as well.
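For example, instead of the "delete rows from both tables yourself" approach from the question, a declared foreign key can do the cleanup atomically. A sketch, assuming cascading deletes are the desired behavior:

```sql
-- With the foreign key declared as:
--   FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
--       ON DELETE CASCADE
-- one statement removes the customer and all of their orders, with no
-- window in which an orphaned order can be created or observed:
DELETE FROM customers WHERE customer_id = 42;
```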
So yeah, if your application is the only one accessing the database (e.g. when using an embedded database), you may get away with application-level enforcement, although why would you if the DBMS can do it for you?
But in a concurrent environment, leave keys and foreign keys to the database - you'll have plenty of work anyway, enforcing your custom "business logic" (that is not directly "declarable" in the DBMS) in a way that is both correct and performant...
That being said, feel free to perform any application-level "pre-checks" that benefit your user experience. But do them in addition to database-level constraints, not instead of them.
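For instance, a signup form might run a quick availability check to give instant feedback, while the UNIQUE constraint remains the actual guarantee (hypothetical query):

```sql
-- Pre-check for user experience only: it tells the user early that a
-- name is taken, but a concurrent signup can still slip in afterwards,
-- which is why the INSERT still relies on the UNIQUE constraint.
SELECT EXISTS (SELECT 1 FROM users WHERE username = 'alice');
```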