When is referential integrity not appropriate?

Question

I understand the need for referential integrity to restrict the values allowed on entry, or to prevent rows from being removed when a deletion is requested. However, I am unclear on what valid use case would justify not using this mechanism.

I guess this would fall into several sub-questions:

When is referential integrity not appropriate?
Is it appropriate to have fields containing multiple and/or possibly incomplete subsets of a foreign key's list?
Typically, should this be a schema structure design decision or an interface design decision? (Or possibly neither or both)

Thoughts?

Keith Adler · Accepted Answer

When is referential integrity not appropriate?

Referential integrity is typically not enforced in data warehouses, where the data is a read-only copy of a transactional database. Another case where you don't need RI is logging information that includes row IDs; maintaining referential integrity for a read-only log table is a waste of database overhead.
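To illustrate the log-table case, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. The engine choice and the table names (customer, audit_log) are assumptions made for the example, not something from the answer. The log column holds a row ID but declares no foreign key, so log entries survive the deletion of the row they point to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Hypothetical log table: customer_id is a plain integer, no REFERENCES clause.
    CREATE TABLE audit_log (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        action TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO audit_log (customer_id, action) VALUES (1, 'created')")

# Deleting the customer does not touch the log, and nothing blocks the delete.
conn.execute("DELETE FROM customer WHERE id = 1")
conn.execute("INSERT INTO audit_log (customer_id, action) VALUES (1, 'deleted')")

log_rows = conn.execute(
    "SELECT customer_id, action FROM audit_log ORDER BY id"
).fetchall()
remaining_customers = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

With a foreign key on audit_log.customer_id, the delete would have been rejected or the history cascaded away, neither of which is usually what you want from a log.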

Is it appropriate to have fields containing multiple and/or possibly incomplete subsets of a foreign key's list?

Sometimes you care more about capturing data than about data quality. Imagine you are aggregating a large amount of data from disparate systems, each of which suffers from its own data quality issues. Sometimes you are after the greater good of data quality, and having everything in one place, even with broken keys etc., represents a starting point for moving towards true data quality. It's not ideal, but it does happen, as the benefits can outweigh the tradeoffs.
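A rough sketch of that "capture first, clean later" approach, again assuming SQLite via Python's sqlite3 and made-up table names (product, staged_sale). The staging table accepts broken or missing keys, and a LEFT JOIN then lists the rows that still need cleanup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Hypothetical staging table: product_id has no REFERENCES clause,
    -- so rows with unknown or missing keys are still captured.
    CREATE TABLE staged_sale (
        id INTEGER PRIMARY KEY,
        source_system TEXT NOT NULL,
        product_id INTEGER,
        amount REAL NOT NULL
    );
    INSERT INTO product (id, name) VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO staged_sale (source_system, product_id, amount) VALUES
        ('legacy_a', 1, 10.0),
        ('legacy_b', 99, 5.0),
        ('legacy_b', NULL, 7.5);
""")

# Rows whose key matches no product (or is NULL) are the cleanup backlog.
orphans = conn.execute("""
    SELECT s.id, s.source_system, s.product_id
    FROM staged_sale AS s
    LEFT JOIN product AS p ON p.id = s.product_id
    WHERE p.id IS NULL
    ORDER BY s.id
""").fetchall()
```

Once the orphan list is empty, the cleaned data could be moved into a table that does declare the foreign key.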

Typically, should this be a schema structure design decision or an interface design decision? (Or possibly neither or both)

Everything about systems development is centered around information security, and a key element of that is data integrity. The database structure should lean towards enforcing these things when possible; however, you are often not dealing with modern database systems. Sometimes your data source is an old-school AS/400 with long-antiquated apps. In those cases you have to build data and business layers that provide data integrity.

Just my thoughts.

Jon Onstott · Answer

The only case I have heard of is if you are going to load a vast amount of data into your database; in that case, it may make sense to turn referential integrity off, as long as you know for certain that the data is valid. Once your loading/migration is complete, referential integrity should be turned back on.
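As a sketch of that load pattern, assuming SQLite via Python's sqlite3 (other engines use different commands to disable and re-enable constraints) and hypothetical author/book tables: enforcement is switched off, the rows are loaded in any order, the data is checked, and enforcement is switched back on.

```python
import sqlite3

# isolation_level=None gives autocommit mode, so the PRAGMA statements below
# are never issued inside an open transaction (where they would be ignored).
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author(id)
    );
""")

conn.execute("PRAGMA foreign_keys = OFF")
conn.execute("BEGIN")
# Children are loaded before parents, and one row points at a missing author.
conn.executemany(
    "INSERT INTO book (title, author_id) VALUES (?, ?)",
    [("B1", 1), ("B2", 2), ("B3", 7)],
)
conn.executemany(
    "INSERT INTO author (id, name) VALUES (?, ?)",
    [(1, "A"), (2, "B")],
)
conn.execute("COMMIT")

# Re-enabling enforcement does not validate rows that are already stored,
# so check explicitly before trusting the load.
violations = conn.execute("PRAGMA foreign_key_check").fetchall()
conn.execute("PRAGMA foreign_keys = ON")

try:
    conn.execute("INSERT INTO book (title, author_id) VALUES ('B4', 42)")
    enforced_again = False
except sqlite3.IntegrityError:
    enforced_again = True
```

The explicit check is the safety net for the "as long as you know for certain that the data is valid" condition: here it reports the one book row whose author does not exist.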

There are arguments about putting data validation rules in programming code vs. the database, and I think it depends on the use cases of your software. If a single application is the only path to the database, you could put validation into the program itself and probably be alright. But if several different programs are using the database at the same time (e.g. your application and your friend's application), you'll want business rules in the database so that your data is always valid.

By 'validation rules', I am talking about rules such as 'items in cart > 0'. You may or may not want validation rules. But I think that primary/foreign keys are always important (or you could find later on that you wish you had them). I think they are required if you want to do replication at some point.
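For a rule like 'items in cart > 0', one way to put it in the database is a CHECK constraint. The sketch below assumes SQLite via Python's sqlite3 and a hypothetical cart table with an item_count column; the helper try_insert is also made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cart (
        id INTEGER PRIMARY KEY,
        -- The rule lives in the schema, so every client is held to it.
        item_count INTEGER NOT NULL CHECK (item_count > 0)
    )
""")


def try_insert(item_count):
    """Return True if the database accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO cart (item_count) VALUES (?)", (item_count,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(2)
rejected = not try_insert(0)
row_count = conn.execute("SELECT COUNT(*) FROM cart").fetchone()[0]
```

Any program that connects to this database gets the same rejection, which is the point of the multi-application argument above.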

Tags: sql, database, database-schema, referential-integrity, schema-design

Curtis Inderwiesche
