
How to best handle the storage of historical data?

I'm trying to determine how I should store historical transactional data.

Should I store it in a single table where the record just gets reinserted with a new timestamp each time?

Should I break out the historical data into a separate 'history' table and keep only current data in the 'active' table?

If so, how do I best do that? With a trigger that automatically copies the data to the history table? Or with logic in my application?

Update per Welbog's comment:

There will be large amounts of historical data (hundreds of thousands of rows, eventually potentially millions).

Primarily searches and reporting operations will be run on the historical data.

Performance is a concern. The searches shouldn't have to run all night to produce results.

Aaron Palmer asked Jan 15 '09 17:01




1 Answer

If the requirement is solely for reporting, consider building a separate data warehouse. This lets you use data structures like slowly changing dimensions, which are much better for historical reporting but don't work well in a transactional system. It also moves the historical reporting off your production database, which is a performance and maintenance win.
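To make the slowly changing dimension idea concrete, here is a minimal sketch of a "type 2" dimension, where each change closes the current row and inserts a new version. It uses Python's built-in sqlite3 module only so that it runs anywhere; the table `dim_customer`, its columns, and the helper functions are hypothetical names chosen for illustration, not something from the question.

```python
import sqlite3

def open_warehouse():
    """Create an in-memory database holding one type 2 dimension table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE dim_customer (
            surrogate_id INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id  INTEGER NOT NULL,  -- business key from the source system
            city         TEXT    NOT NULL,  -- the attribute whose history we track
            valid_from   TEXT    NOT NULL,  -- ISO date, inclusive
            valid_to     TEXT,              -- ISO date, exclusive; NULL = still current
            is_current   INTEGER NOT NULL DEFAULT 1
        )""")
    return conn

def apply_change(conn, customer_id, city, as_of):
    """Close the current version (if the value differs) and insert a new one."""
    row = conn.execute(
        "SELECT surrogate_id, city FROM dim_customer "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,)).fetchone()
    if row is not None and row[1] == city:
        return  # nothing changed, keep the existing version
    with conn:  # both statements commit together or not at all
        if row is not None:
            conn.execute(
                "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
                "WHERE surrogate_id = ?",
                (as_of, row[0]))
        conn.execute(
            "INSERT INTO dim_customer (customer_id, city, valid_from) "
            "VALUES (?, ?, ?)",
            (customer_id, city, as_of))

def city_as_of(conn, customer_id, as_of):
    """Return the city that was in effect on the given date, or None."""
    row = conn.execute(
        "SELECT city FROM dim_customer "
        "WHERE customer_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (customer_id, as_of, as_of)).fetchone()
    return row[0] if row else None
```

Reports can then join fact rows to whichever dimension version was valid on the fact's date, without touching the transactional tables.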

If you need this history to be available within the application, then you should implement some sort of versioning or logical deletion feature, or make everything fully contra and restate (i.e. transactions never get deleted, just reversed out and restated). Think very carefully about whether you really need this, as it will add a lot of complexity. Making a transactional application that can reconstruct historical state correctly is considerably harder than it looks. Financial software (e.g. insurance underwriting systems) fails to do this a lot more often than you might think.
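As a rough illustration of contra and restate, the sketch below treats the table as append-only: a correction inserts a reversing entry plus a restated entry and never updates or deletes the original. It uses Python's sqlite3 module for self-containment; the `ledger` table, its columns, and the function names are assumptions made up for this example.

```python
import sqlite3

def open_ledger():
    """Create an in-memory, append-only ledger table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE ledger (
            entry_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            account   TEXT    NOT NULL,
            amount    INTEGER NOT NULL,  -- minor units, e.g. pence
            posted_at TEXT    NOT NULL,  -- ISO date
            reverses  INTEGER REFERENCES ledger(entry_id)  -- set on contra entries
        )""")
    return conn

def _insert(conn, account, amount, posted_at, reverses=None):
    cur = conn.execute(
        "INSERT INTO ledger (account, amount, posted_at, reverses) "
        "VALUES (?, ?, ?, ?)",
        (account, amount, posted_at, reverses))
    return cur.lastrowid

def post(conn, account, amount, posted_at):
    """Record a new transaction and return its entry id."""
    with conn:
        return _insert(conn, account, amount, posted_at)

def restate(conn, entry_id, new_amount, posted_at):
    """Reverse an earlier entry and post the corrected amount, atomically."""
    account, amount = conn.execute(
        "SELECT account, amount FROM ledger WHERE entry_id = ?",
        (entry_id,)).fetchone()
    with conn:
        _insert(conn, account, -amount, posted_at, reverses=entry_id)  # contra
        return _insert(conn, account, new_amount, posted_at)           # restate

def balance_as_of(conn, account, as_of):
    """Sum every entry posted on or before the given date."""
    return conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger "
        "WHERE account = ? AND posted_at <= ?",
        (account, as_of)).fetchone()[0]
```

Because nothing is overwritten, the balance for any past date can be recomputed from the entries posted up to that date. A real system would also need to handle things this sketch ignores, such as backdated corrections and entries that have already been reversed.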

If you need the history solely for audit logging, make shadow tables and audit logging triggers. This is much simpler and more robust than trying to correctly and comprehensively implement audit logging within the application. The triggers will also pick up changes to the database from sources outside the application.
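A shadow table with audit triggers might look roughly like the following. The sketch uses SQLite through Python's sqlite3 module so that it is runnable as-is; trigger syntax in MySQL and other databases differs in the details, and the `orders` and `orders_audit` tables and their columns are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE orders (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL
);

-- Shadow table: one row per change made to orders.
CREATE TABLE orders_audit (
    audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id   INTEGER NOT NULL,
    action     TEXT    NOT NULL,
    old_status TEXT,
    new_status TEXT,
    changed_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER orders_ai AFTER INSERT ON orders BEGIN
    INSERT INTO orders_audit (order_id, action, new_status)
    VALUES (NEW.id, 'INSERT', NEW.status);
END;

CREATE TRIGGER orders_au AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_audit (order_id, action, old_status, new_status)
    VALUES (NEW.id, 'UPDATE', OLD.status, NEW.status);
END;

CREATE TRIGGER orders_ad AFTER DELETE ON orders BEGIN
    INSERT INTO orders_audit (order_id, action, old_status)
    VALUES (OLD.id, 'DELETE', OLD.status);
END;
"""

def open_db():
    """Create an in-memory database with the table, shadow table and triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def audit_trail(conn, order_id):
    """Return (action, old_status, new_status) tuples in the order they happened."""
    return conn.execute(
        "SELECT action, old_status, new_status FROM orders_audit "
        "WHERE order_id = ? ORDER BY audit_id",
        (order_id,)).fetchall()
```

The application only ever writes to `orders`; the audit rows are produced by the database itself, which is why changes made outside the application are captured too.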

ConcernedOfTunbridgeWells answered Sep 19 '22 03:09