
How to deal with old, obsolete database data of a long running system?

What options does a programmer have for handling data that is rarely used but cannot simply be deleted because reporting, at least, still requires it?

Some examples I am thinking of:

  • Discontinued funding types from earlier years at a university
  • Unused currencies (e.g. Italian lira)
  • Names of disappeared countries (e.g. Austro-Hungary, USSR)

Some partial solutions are activity flags, activity periods, and display priorities, but each of them requires a case-by-case decision, and it is hard to know which types of entities need this special handling.

Maybe there is a design pattern for this problem.

Conclusions: (based on the answers so far)

  • If old data makes everyday work difficult on a huge database, partitioning would be helpful. Oracle's description on this subject is here.

  • From the designer's point of view, the taxonomy of slowly changing dimensions gives some background information.

rics asked Oct 12 '08 10:10



2 Answers

For old data that most queries do not touch, the best solution is to partition tables by the key that differentiates stale from current data (such as date, currency_id, or something similar). You can then put the stale data in separate tables, databases, or even servers, depending on the configuration you have running.

The downside is that your application must become partition-aware so that it knows where to find the data (though there are abstractions that help deal with sharding and partitioning).
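To make "partition-aware" concrete, here is a minimal sketch of application-level routing, using Python's built-in sqlite3 module and two tables in one in-memory database. The table names (payment_archive, payment_current), the columns, the CUTOFF date, and the sample rows are all hypothetical; a database with native partitioning would do this routing for you.

```python
import sqlite3

# Hypothetical boundary between stale and current rows.
CUTOFF = "2002-01-01"

conn = sqlite3.connect(":memory:")
for table in ("payment_archive", "payment_current"):
    conn.execute(
        f"CREATE TABLE {table} (paid_on TEXT, currency TEXT, amount REAL)"
    )


def table_for(paid_on):
    """Pick the table that holds a row with this ISO date."""
    return "payment_archive" if paid_on < CUTOFF else "payment_current"


def insert_payment(conn, paid_on, currency, amount):
    conn.execute(
        f"INSERT INTO {table_for(paid_on)} VALUES (?, ?, ?)",
        (paid_on, currency, amount),
    )


def payments_between(conn, start, end):
    """Query only the tables whose date range overlaps [start, end]."""
    tables = []
    if start < CUTOFF:
        tables.append("payment_archive")
    if end >= CUTOFF:
        tables.append("payment_current")
    rows = []
    for table in tables:
        rows.extend(
            conn.execute(
                f"SELECT paid_on, currency, amount FROM {table}"
                " WHERE paid_on BETWEEN ? AND ?",
                (start, end),
            )
        )
    return sorted(rows)


insert_payment(conn, "1998-05-04", "ITL", 150000.0)
insert_payment(conn, "2008-10-12", "EUR", 80.0)

# Everyday queries touch only the current table; reports span both.
everyday = payments_between(conn, "2008-01-01", "2008-12-31")
report = payments_between(conn, "1990-01-01", "2008-12-31")
```

Everyday queries never read the archive table, while a report over a long date range transparently reads both. The cost is exactly the downside mentioned above: every data-access function has to know about the cutoff.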

Eran Galperin answered Sep 27 '22 17:09


For any entity that can have a limited lifetime, just add a time component to its definition. E.g. your Italian lira can be modeled as:

CREATE TABLE Currency (CurrencyID NUMBER, CurrencyStartDate DATETIME, CurrencyEndDate DATETIME)

You can then exclude the expired currencies from any application functions related to current activity, and still maintain the relationship for historical data.
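As a rough illustration of that filtering, the sketch below uses Python's built-in sqlite3 module with an in-memory table. The Code column, the convention that a NULL end date means "still in use", the end-date column being named CurrencyEndDate, the helper name currencies_valid_on, and the sample dates are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Currency ("
    " CurrencyID INTEGER PRIMARY KEY,"
    " Code TEXT,"
    " CurrencyStartDate TEXT,"  # ISO dates stored as text compare correctly
    " CurrencyEndDate TEXT)"    # assumption: NULL means still in use
)
conn.executemany(
    "INSERT INTO Currency VALUES (?, ?, ?, ?)",
    [
        # Illustrative sample rows, not authoritative dates.
        (1, "ITL", "1861-01-01", "2002-02-28"),
        (2, "EUR", "1999-01-01", None),
    ],
)


def currencies_valid_on(conn, day):
    """Return the codes of currencies whose lifetime includes `day`."""
    rows = conn.execute(
        "SELECT Code FROM Currency"
        " WHERE CurrencyStartDate <= ?"
        " AND (CurrencyEndDate IS NULL OR CurrencyEndDate >= ?)"
        " ORDER BY Code",
        (day, day),
    )
    return [code for (code,) in rows]


# What a data-entry form would offer on a given day.
current = currencies_valid_on(conn, "2008-10-12")
# What was valid on an older transaction date.
historic = currencies_valid_on(conn, "2000-06-30")
```

The expired row is never deleted, so old transactions can still join to it by CurrencyID; only the lookups that feed current activity apply the date filter.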

Andrew not the Saint answered Sep 27 '22 16:09