Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database triggers / referential integrity and in-memory caching

Tags:

Do you see database triggers / referential integrity rules being used in a way that changes actual data in the database (changing row w in table x causes a change in row y in table z)?

If yes, How does this tie-in with the increasing popularity of in-memory caching (memcache and friends)? After all, these actions occur inside the database but the caching system must be aware of them in order to reflect to correct state (or at least invalidate the possibly changed state). I find it hard to believe that callbacks are implemented for such cases.

Does anyone have real-world experience with such a setup / real-world experience with considering such a setup and abandoning it (which way did you go? if caching, how do you enforce integrity?)

like image 630
Ran Biron Avatar asked Mar 15 '10 22:03

Ran Biron


People also ask

What is in memory cache database?

What is an In-Memory Cache? An in-memory cache is a data storage layer that sits between applications and databases to deliver responses with high speeds by storing data from earlier requests or copied directly from databases.

How Data integrity can be enforced through triggers?

Database Triggers to enforce referential integrity when a required referential integrity rule cannot be enforced using the integrity constraints listed above: update CASCADE, update and delete SET NULL, update and delete SET DEFAULT.

How is database caching done?

A database cache supplements your primary database by removing unnecessary pressure on it, typically in the form of frequently accessed read data. The cache itself can live in a number of areas including your database, application or as a standalone layer.

What is meant by referential integrity?

Referential integrity refers to the relationship between tables. Because each table in a database must have a primary key, this primary key can appear in other tables because of its relationship to data within those tables. When a primary key from one table appears in another table, it is called a foreign key .


1 Answers

Simple Answer:

  • Referential integrity is a must have
  • Caching is a qualified must have
  • Triggers are a nice to have

Longer Answer

I've been developing applications on relational databases since 1993 (Dec RDB since you ask, and on flat file systems before then) and triggers have never been popular with many developers because they can 'delete stuff that you don't want deleting'. Referential integrity is also often frowned on by developers because a database in third normal form with proper referential integrity is difficult to bodge together in a few minutes.

Caching is also often considered quite 'hard' to do right, although I'm not sure why.

Whilst many systems can live without triggers, I'd say that no application database can comfortably survive without referential integrity. Look at the tags on this question, the database behind this site will have a table for the tags (probably called 'Tag') and questions (probably called 'Question'). 'Question' will have a foreign key to the primary key on the Tag table, but as questions can have many tags and tags can have many questions I'd guess that the relationship is like this:

   Question
   (TagId)         1 | Database triggers / referential integrity and in-memory caching
      |  
    -----
    | | |
  QuestionTag
 (QuestionId)       1 | 1  ... 1 | 2  ... 1 | 3 ...
    (TagId)
    | | |
    -----
      |
     Tag            1 | database ... 2 | referential-integrity ... 3 | triggers ...
   (TagId)

This kind of referential integrity is the bedrock of any reliable application and isn't negotiable. You can see how it adds credibility to the application design and confidence in its longevity.

The caching on SO may be turned on for such a thing as tags (although it isn't guaranteed) so assume that the tags are cached in memory and that you have enough reputation to be allowed to add a tag to SO. You add your tag and it may well be persisted to the database instantly - but is the cache then updated?

What you have is a trade-off. Can the site survive without knowing about your new tag? And if so for how long? Fundamentally what is the lifecycle of a tag, as it progresses from being added by a user to being in the database, available to other users, used by other users? The cache will be rebuilt according to the rules set down by the development team - and that rule will be a trade-off essentially so that any new tag is available quickly enough without slowing the application down.

Triggers can enforce referential integrity, say the tag you add is 'rubbish', but by the time the admins see it three questions are tagged 'rubbish'. The admins then decide to delete the 'rubbish' tag - but what about the questions that are tagged with it? If there is a trigger on the 'tag' table that is fired on the delete, it could then run round the 'question' table and remove all references to 'rubbish'. There are lots of alternatives to this approach - many of which are programmatic workrounds - but is there a cleaner alternative?

I've worked on lots of sites in the last 20 years, the good ones use referential integrity and increasingly caching. Triggers that change data anonymously (all they fundamentally are are event driven stored procedures) are not popular and increasingly misunderstood but still have a role.

Caching and referential integrity cannot be considered an either-or - development teams must design applications so that both can be incorporated.

like image 104
amelvin Avatar answered Sep 28 '22 19:09

amelvin