I'm trying to determine how I should store historical transactional data.
Should I store it in a single table where the record just gets reinserted with a new timestamp each time?
Or should I break out the historical data into a separate 'history' table and only keep current data in the 'active' table?
If so, how do I best do that? With a trigger that automatically copies the data to the history table? Or with logic in my application?
Update per Welbog's comment:
There will be large amounts of historical data (hundreds of thousands of rows - eventually potentially millions)
Primarily searches and reporting operations will be run on the historical data.
Performance is a concern. The searches shouldn't have to run all night to produce results.
Audit (security purposes): use a common table for all your auditable tables, with a structure that stores the column name, the before value, and the after value. Archive/historical: for cases like tracking a previous address, phone number, etc.
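A minimal sketch of the common audit table idea, using Python's built-in `sqlite3` so it runs anywhere. The `audit_log` table, its column names, and the `log_changes` helper are all made up for illustration; your schema and types will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_log (
        audit_id     INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name   TEXT NOT NULL,
        row_id       INTEGER NOT NULL,
        column_name  TEXT NOT NULL,
        before_value TEXT,
        after_value  TEXT,
        changed_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")


def as_text(value):
    # Values are stored as text so one table can serve every audited table.
    return None if value is None else str(value)


def log_changes(conn, table_name, row_id, before, after):
    """Insert one audit row per column whose value differs (hypothetical helper)."""
    changes = [
        (table_name, row_id, column,
         as_text(before.get(column)), as_text(after.get(column)))
        for column in sorted(set(before) | set(after))
        if before.get(column) != after.get(column)
    ]
    conn.executemany(
        "INSERT INTO audit_log "
        "(table_name, row_id, column_name, before_value, after_value) "
        "VALUES (?, ?, ?, ?, ?)",
        changes,
    )
    return len(changes)


before = {"address": "1 Old Road", "phone": "555-0100"}
after = {"address": "2 New Street", "phone": "555-0100"}
logged = log_changes(conn, "customer", 7, before, after)

rows = conn.execute(
    "SELECT table_name, row_id, column_name, before_value, after_value "
    "FROM audit_log"
).fetchall()
```

Only the changed column produces a row, so the table stays small, at the cost of storing every value as text.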
Data warehouses typically store historical data by integrating copies of transaction data from disparate sources. Data warehouses can also use real-time data feeds for reports that use the most current, integrated information.
The use of historical data has become a standard tool in economics, serving three main purposes: to examine the influence of the past on current economic outcomes; to use unique natural experiments to test modern economic theories; and to use modern economic theories to refine our understanding of important historical ...
Historical data enables the tracking of improvement over time which gives key insights. These insights are essential for driving a business. Marketers are always on the run to better understand and segment the customers. Keeping historical data can help marketers understand if their customer segment is changing.
If the requirement is solely for reporting, consider building a separate data warehouse. This lets you use data structures like slowly changing dimensions that are much better for historical reporting but don't work well in a transactional system. The resulting combination also moves the historical reporting off your production database which will be a performance and maintenance win.
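To make "slowly changing dimension" concrete, here is a rough sketch of the common Type 2 variant, where each change closes the current row and opens a new one. It uses Python's `sqlite3` for convenience; the `customer_dim` table, its columns, and both helper functions are assumptions for the example, not a prescribed warehouse design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_dim (
        surrogate_key INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id   INTEGER NOT NULL,
        city          TEXT NOT NULL,
        valid_from    TEXT NOT NULL,  -- ISO date, inclusive
        valid_to      TEXT            -- ISO date, exclusive; NULL = current row
    )
""")


def record_city(conn, customer_id, city, as_of):
    """Close the current row (if the city changed) and open a new version."""
    current = conn.execute(
        "SELECT surrogate_key, city FROM customer_dim "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (customer_id,),
    ).fetchone()
    if current is not None:
        if current[1] == city:
            return  # nothing changed, keep the existing version
        conn.execute(
            "UPDATE customer_dim SET valid_to = ? WHERE surrogate_key = ?",
            (as_of, current[0]),
        )
    conn.execute(
        "INSERT INTO customer_dim (customer_id, city, valid_from) "
        "VALUES (?, ?, ?)",
        (customer_id, city, as_of),
    )


def city_as_of(conn, customer_id, date):
    """Return the city that was in effect on the given ISO date, or None."""
    row = conn.execute(
        "SELECT city FROM customer_dim "
        "WHERE customer_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (customer_id, date, date),
    ).fetchone()
    return row[0] if row else None


record_city(conn, 42, "Leeds", "2020-01-01")
record_city(conn, 42, "York", "2021-06-01")
```

With this shape, `city_as_of(conn, 42, "2020-12-31")` finds the Leeds row and a later date finds the York row, which is what makes point-in-time reports straightforward.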
If you need this history to be available within the application then you should implement some sort of versioning or logical deletion feature or make everything fully contra and restate (i.e. transactions never get deleted, just reversed out and restated). Think very carefully about whether you really need this as it will add a lot of complexity. Making a transactional application that can reconstruct historical state correctly is considerably harder than it looks. Financial software (e.g. insurance underwriting systems) fails to do this a lot more than you might think.
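As a toy illustration of contra and restate, the sketch below keeps an append-only list of entries: a correction posts a reversing entry for the original amount and then a new entry with the corrected amount. The `Entry` and `Ledger` classes are invented for this example and leave out everything a real system needs (dates, persistence, concurrency).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    entry_id: int
    account: str
    amount: int                     # minor units, e.g. cents
    reverses: Optional[int] = None  # id of the entry this one cancels out


class Ledger:
    """Append-only ledger: entries are never updated or deleted."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def post(self, account: str, amount: int,
             reverses: Optional[int] = None) -> Entry:
        entry = Entry(len(self.entries) + 1, account, amount, reverses)
        self.entries.append(entry)
        return entry

    def correct(self, entry_id: int, new_amount: int) -> Entry:
        """Reverse out the original entry, then restate it with new_amount."""
        original = self.entries[entry_id - 1]
        self.post(original.account, -original.amount,
                  reverses=original.entry_id)
        return self.post(original.account, new_amount)

    def balance(self, account: str, up_to_entry: Optional[int] = None) -> int:
        """Balance now, or as it stood after a given entry id."""
        return sum(
            e.amount for e in self.entries
            if e.account == account
            and (up_to_entry is None or e.entry_id <= up_to_entry)
        )


ledger = Ledger()
first = ledger.post("premium", 500)
restated = ledger.correct(first.entry_id, 450)
```

Because nothing is overwritten, `ledger.balance("premium", up_to_entry=1)` still reports the balance as it was originally booked, while `ledger.balance("premium")` reflects the correction.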
If you need the history solely for audit logging, make shadow tables and audit logging triggers. This is much simpler and more robust than trying to correctly and comprehensively implement audit logging within the application. The triggers will also pick up changes to the database from sources outside the application.
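A shadow table with audit triggers might look roughly like this. The sketch uses SQLite through Python's `sqlite3` only because it is self-contained; trigger syntax differs between database engines, and the `account` and `account_history` tables and their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance INTEGER NOT NULL
    );

    -- Shadow table: the audited columns plus audit metadata.
    CREATE TABLE account_history (
        history_id INTEGER PRIMARY KEY AUTOINCREMENT,
        id         INTEGER NOT NULL,
        owner      TEXT NOT NULL,
        balance    INTEGER NOT NULL,
        operation  TEXT NOT NULL,
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- Copy the old row image whenever a row is updated.
    CREATE TRIGGER account_audit_update
    AFTER UPDATE ON account
    BEGIN
        INSERT INTO account_history (id, owner, balance, operation)
        VALUES (OLD.id, OLD.owner, OLD.balance, 'UPDATE');
    END;

    -- Copy the old row image whenever a row is deleted.
    CREATE TRIGGER account_audit_delete
    AFTER DELETE ON account
    BEGIN
        INSERT INTO account_history (id, owner, balance, operation)
        VALUES (OLD.id, OLD.owner, OLD.balance, 'DELETE');
    END;
""")

# Plain SQL against the active table; the triggers do the logging.
conn.execute("INSERT INTO account (id, owner, balance) VALUES (1, 'alice', 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

history = conn.execute(
    "SELECT id, owner, balance, operation "
    "FROM account_history ORDER BY history_id"
).fetchall()
```

The statements that modify `account` know nothing about the history table, which is the point: any client that touches the table gets audited, not just the application.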