
Best practices with historical data in MySQL database

Tags: sql, mysql

Recently I have been thinking about best practices for storing historical data in a MySQL database. For now, each versionable table has two columns, valid_from and valid_to, both of type DATETIME. A record holding current data has valid_from set to its creation date. When I update the row, I set valid_to to the update date and add a new record whose valid_from equals the valid_to of the previous row. Easy stuff. But I know the table will become enormous very quickly, so fetching data may become very slow.
Do you have any recommended practices for storing historical data?
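
For illustration, the scheme described above might look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name product_price, the columns product_id and price, and the helper set_price are hypothetical, and it assumes the current row is the one whose valid_to is NULL.

```python
import sqlite3

# SQLite stands in for MySQL here; in MySQL the two date columns would be DATETIME.
# Everything except valid_from / valid_to is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_price (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        price      REAL NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to   TEXT             -- assumption: NULL marks the current version
    )
""")

def set_price(conn, product_id, price, changed_at):
    """Close the current version (if any) and open a new one at changed_at."""
    with conn:  # one transaction: both statements commit or neither does
        conn.execute(
            "UPDATE product_price SET valid_to = ? "
            "WHERE product_id = ? AND valid_to IS NULL",
            (changed_at, product_id),
        )
        conn.execute(
            "INSERT INTO product_price (product_id, price, valid_from) "
            "VALUES (?, ?, ?)",
            (product_id, price, changed_at),
        )

set_price(conn, 1, 9.99, "2013-01-01 00:00:00")
set_price(conn, 1, 12.49, "2013-06-01 00:00:00")

rows = conn.execute(
    "SELECT price, valid_from, valid_to FROM product_price ORDER BY valid_from"
).fetchall()
```

After the two calls, the first row is closed at the moment the second one opens, and the second row has no valid_to yet.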

Kuba T asked Jun 11 '13 19:06



1 Answer

It's a common mistake to worry about "large" tables and performance. If you can use indexes to access your data, it doesn't really matter whether you have 1,000 or 1,000,000 records, at least not in any way you'd be able to measure. The design you mention is commonly used; it's a great design where time is a key part of the business logic.
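
As a rough sketch of what "using indexes to access your data" could mean here, the snippet below creates a composite index and asks the database whether a lookup would use it. It runs on Python's built-in sqlite3 as a stand-in, so the plan text is SQLite's; in MySQL the comparable check would be an EXPLAIN on the same SELECT. The table product_price, its columns other than valid_from and valid_to, and the index name idx_product_validity are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_price ("
    " product_id INTEGER NOT NULL, price REAL NOT NULL,"
    " valid_from TEXT NOT NULL, valid_to TEXT)"
)
# Hypothetical composite index: equality column first, then the range column.
conn.execute(
    "CREATE INDEX idx_product_validity ON product_price (product_id, valid_from)"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT price FROM product_price "
    "WHERE product_id = ? AND valid_from <= ?",
    (1, "2013-03-15 00:00:00"),
).fetchall()

# The last column of each plan row is a human-readable description of the step.
plan_text = " ".join(row[-1] for row in plan)
uses_index = "idx_product_validity" in plan_text
print(plan_text)
```

If uses_index is true, the lookup is an index search rather than a full table scan, which is the situation the paragraph above describes.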

For instance, if you want to know what the price of an item was at the point when the client placed the order, being able to search product records where valid_from <= order_date and valid_to is either null or > order_date is by far the easiest solution.
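
A minimal sketch of that lookup, again using Python's built-in sqlite3 as a stand-in for MySQL. The table product_price, the sample prices and the helper price_at are hypothetical. The lower bound is written as <= and the upper bound as >, an assumption that treats each version as a half-open interval, so a timestamp that falls exactly on a change matches only the newer row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_price ("
    " product_id INTEGER NOT NULL, price REAL NOT NULL,"
    " valid_from TEXT NOT NULL, valid_to TEXT)"
)
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?, ?)",
    [
        (1, 9.99, "2013-01-01 00:00:00", "2013-06-01 00:00:00"),
        (1, 12.49, "2013-06-01 00:00:00", None),  # current version
    ],
)

def price_at(conn, product_id, moment):
    """Return the price valid at `moment`, or None if no version covers it."""
    row = conn.execute(
        "SELECT price FROM product_price "
        "WHERE product_id = ? "
        "  AND valid_from <= ? "
        "  AND (valid_to IS NULL OR valid_to > ?)",
        (product_id, moment, moment),
    ).fetchone()
    return row[0] if row else None

price_when_ordered = price_at(conn, 1, "2013-03-15 10:30:00")  # old version
price_today = price_at(conn, 1, "2013-06-11 19:06:00")         # current version
price_at_change = price_at(conn, 1, "2013-06-01 00:00:00")     # boundary instant
price_before_any = price_at(conn, 1, "2012-12-31 23:59:59")    # no version yet
```

The same WHERE clause works unchanged in MySQL, with order_date bound in place of moment.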

This isn't always the case: if you're keeping the data around just for archive purposes, it may make more sense to create archive tables. However, you have to be sure that time is really not part of the business logic. Otherwise the pain of searching multiple tables will be significant. Imagine having to search either the product table OR the product_archive table every time you want to find out the price of a product at the point the order was placed.
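
To make that pain concrete, here is a sketch of the same point-in-time lookup once history has been split off into an archive table. It uses Python's built-in sqlite3 as a stand-in for MySQL. The product and product_archive names come from the paragraph above; the column layout, the sample rows and the helper price_at are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = (
    "CREATE TABLE {name} ("
    " product_id INTEGER NOT NULL, price REAL NOT NULL,"
    " valid_from TEXT NOT NULL, valid_to TEXT)"
)
conn.execute(ddl.format(name="product"))          # current rows only
conn.execute(ddl.format(name="product_archive"))  # superseded rows
conn.execute(
    "INSERT INTO product VALUES (1, 12.49, '2013-06-01 00:00:00', NULL)"
)
conn.execute(
    "INSERT INTO product_archive VALUES "
    "(1, 9.99, '2013-01-01 00:00:00', '2013-06-01 00:00:00')"
)

def price_at(conn, product_id, moment):
    """Look in both tables, because the answer may live in either one."""
    row = conn.execute(
        "SELECT price FROM ("
        "  SELECT product_id, price, valid_from, valid_to FROM product"
        "  UNION ALL"
        "  SELECT product_id, price, valid_from, valid_to FROM product_archive"
        ") AS all_versions "
        "WHERE product_id = ? AND valid_from <= ? "
        "  AND (valid_to IS NULL OR valid_to > ?)",
        (product_id, moment, moment),
    ).fetchone()
    return row[0] if row else None

archived_price = price_at(conn, 1, "2013-03-15 10:30:00")
current_price = price_at(conn, 1, "2013-06-11 19:06:00")
```

Every historical query now needs the UNION ALL (or two separate queries), which is the extra cost the single-table design avoids.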

Neville Kuyt answered Oct 19 '22 06:10
