Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Effective management of data changes

I have a table called Bookings. This table contains data representing a booking made for a particular service, with many variables.

A while ago I came across a problem with my current data structure whereby any changes to the booking that affected times, dates or prices would affect other associated financial records, bookings lists for dates etc.

My solution at the time was to create a Modifications table which would track any changes made to a Booking. Then, whenever the Booking model was asked to return a booking, it would add on Modifications made (in the afterFind() Cake callback) and present the most up-to-date version of the booking, something like this (excuse the Paint drawing):

enter image description here

This method works fine when you ask the Booking model to return booking #1234. It returns the most up-to-date representation of the booking including all modifications (layered on top of each other), including an array containing all the modifications and the original booking data for reference.

My problem is that I've recently realised that I need to be able to query this model with custom conditions, and if one of those conditions was realised in one of the modifications, the result wouldn't match because the model is searching the original record rather than the finally presented record. Example where I query the model to return rows where abc is blue (not grey):

enter image description here

In that example, the model looks straight at the original data for rows where abc is blue and doesn't return this result, because the blue value is in a Modification which is attached after the original results are found.

What I've done now is put a query into the beforeFind() callback of the Booking model to look for modifications that match the given criteria, joining the booking to make sure that any other criteria still matches. When it returns the blue in example above, it stores that result in an array as a class property and continues with the regular find(), but excludes that booking's ID from being returned (because we've found a more up-to-date verison of it). Then it'll merge them together, sort them again etc in the afterFind().

This works, although it's a little more long-winded that I was hoping for.

After all that, I've realised that in other parts of this application, there are models that are manually joining to the bookings table and searching for bookings. So now I need a way to be able to incorporate the modifications into all of those manual joins straight to the table in MySQL without affecting the original data and preferably without changing too much of my code.

My thoughts were that I need to remove the manual join and create a model association instead. Will the beforeFind() and afterFind() of the Booking model still run when I query say the Customer model which hasMany Bookings (to apply the modifications to each booking)?

My other option was to return more rows from MySQL than necessary by removing any criteria that might be contained in the modifications, then use PHP to filter the results as per my search criteria. This option scared me a little because the result set has the potential to be massive without that criteria...


How can I achieve this data structure? My key requirements are still that I do not want to change the original Booking record, rather add Modification records on top, but I need to be able to query bookings (including modifications) through the model.

I want to try and keep as much of this integration behind the scenes as possible so I won't have to go through my entire application to change n number of queries that look like this:

$get_blue = $this->Booking->find('all', array(
    'conditions' => array(
        'Booking.abc' => 'blue'
    )
));

I want to be able to implicitly include any modifications made to bookings so that the up-to-date booking will be returned in the above query.

The other problem is when the Booking model is manually joined to a search query, like this:

$get_transactions_on_blue_bookings = $this->Transaction->find('all', array(
    'joins' => array(
        array(
            'table' => 'sql_bookings_table', // non-standard Cake format, I know - it's an example
            'alias' => 'Booking',
            'type' => 'LEFT',
            'conditions' => 'Booking.booking_id = Transaction.booking_id'
        )
    ),
    'conditions' => array(
        'Booking.abc' => 'blue'
    )
));

As you can see, the above query won't include the modification in my MSPaint example above, because it's manually joining the table in SQL (the modification integration is in the before and afterFind() callback functions of the Booking model).

Any help on this would be greatly appreciated.

Edit

I know this is long enough already, but I thought I'd add that the reason I want to track these changes and not update the original record is that the financial aspect can't change, because it will affect reporting.

The quickest and easiest solution I can see so far is to apply modifications directly to the original booking in all cases except when it affects financial information, which is still tracked as a modification (because I don't currently need to search based on this info).

like image 341
scrowler Avatar asked Jun 19 '14 01:06

scrowler


2 Answers

It sounds like you're trying to implement a Temporal Database. Temporal support was one of the major additions to the ANSI/ISO SQL:2011 standard. MySQL (like most RDBMS) lags behind the standard. Think of Temporal Database as the DBMS equivalent of CVS/SVN/Git.

By contrast, the traditional database we use without temporal features can be called a Current Database.

In a Current Database, if you try to implement temporal support, you can fail in many ways with different approaches:

  • The one-table approach. When you need to make modifications, you do UPDATEs on your original records, and unless you have some sort of homegrown trigger/audit logic, the history trail is absent. Even if you have an audit/change log, you'd have to do some ugly digging to reconstruct the change history.

  • The two-table approach. Instead of making modifications in-place, you split out your data into two tables, one with the base/original records (e.g. booking), and another table for your changes/modifications/deltas. Then at least you have your original data preserved, but again you have to write complex logic to view the original data with modifications layered on. It gets even worse if you want only some of the modifications applied.

  • The precalculated resultant table approach. You keep 3 or more tables: the base records, the modifications, and also a table which attempts to always have the resultant (keeps up to date the base + modifications). Good luck writing the triggers and procedures to do this calculation whenever you do INSERTs, and Heaven help you if an UPDATE or DELETE is needed. The setup is fragile and could break out of sync, such as deadlocks & rollback. If you don't do this within the DB with triggers/procedures, you could try to implement resultant calculation it in the application code, but have good luck at that -- and it could get ugly with multi-threaded consumers. And still, you don't have easy access to resultants with only some modifications applied.

Conclusion: If you're not limited to MySQL, you should really consider using a DB that has built-in temporal support. Otherwise, you're going to re-implement the wheel.

like image 158
Joshua Huber Avatar answered Oct 17 '22 08:10

Joshua Huber


Instead of applying the modifications to the original record, what if you did the reverse and applied the original record to the modifications? You could modify the modifications table (or a new table) to hold the original record with the modifications applied to it, and direct your searches there.

Another thought is that if the financial data is all that needs to be preserved, why not save it in another field or table and reference it when you need it? I agree that a re-design is probably the best/smartest approach for a long-term solution, but I figured I'd put my ideas on the table in case they can help.

like image 20
Trick Avatar answered Oct 17 '22 08:10

Trick