
Master Data Management - Data redundancy

I am currently developing a system that holds all relevant master data, for example customers, or information about an operational system that exists within our system landscape. The IDs assigned to these entities are unique across the enterprise. When some system stores e.g. customer-related data, it has to hold the master data ID of the customer as well. The master data system is based on .NET and an MSSQL 2005 database.

  1. Now my question is: when developing another system with its own assemblies, database, etc. that uses data from the MDM system, would you store that data redundantly in the other system's database, create its own business entities (like customer), and copy the required master data from the MDM into the other database (e.g. via ETL)? That way the other system is detached from the MDM and only stores the global master data IDs.

  2. Or would you integrate the assemblies of the MDM into the other systems (if they are .NET, of course) and use the MDM's data layer to load global entities (like a customer)?

  3. Or would you let the other system create its own entities, but retrieve master data through a SOAP interface provided by the MDM?

I tend toward approach no. 1 because I think it's better to detach other systems from the MDM solution (separation of concerns), since the MDM solution can hold much more data about a customer entity than I would need in some other system where just the customer's name is required. Option 3 would be possible, but web services might slow down an operational system a lot. What do you think?
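To make option 1 concrete, here is a minimal sketch of a satellite-side entity that keeps only the enterprise-wide master data ID plus the one attribute the system actually needs. Python is used purely for illustration (the real system would be .NET/MSSQL), and all names here are hypothetical, not taken from the actual MDM:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    """Lean local entity: just the global ID and the fields this system needs."""
    master_data_id: int   # enterprise-wide ID assigned by the MDM system
    name: str             # the only master attribute required locally

def load_from_mdm_extract(rows):
    """Build local entities from a (hypothetical) ETL extract of (id, name) pairs."""
    return {r[0]: Customer(master_data_id=r[0], name=r[1]) for r in rows}
```

The full customer record stays in the MDM; the ETL job only refreshes this narrow projection, keyed by the global ID.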

Asked Feb 16 '10 by Chris


2 Answers

  1. This carries a high risk of the data getting out of sync, followed by a major headache trying to unravel it.

  2. This is a viable option, but you'd have to maintain a decent set of assemblies for people to use. This is a non-trivial task, as they need to be robust and well documented, with a usable API plus some sensible release management, much like you'd expect from any third-party framework. In my experience this is a level of rigour above normal LOB development practices.

  3. Got to be the way to go, I'd say. A service-oriented architecture lets other people access the data while giving them flexibility in how they access and consume it.

As you say, performance may be the deciding factor, in which case 1 combined with 3 could be best: the local copies are viewed and treated as cached data only, rather than as reliably up-to-date copies. The application could do a quick check with the master DB to see whether the local cache is still valid (much like a HEAD request in HTTP land) and then either use the local data or refresh it from the master database.
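That HEAD-style validity check could be sketched roughly like this. Everything here is an assumption for illustration (`MasterClient`, `CachedRecord`, the version-stamp scheme), and Python stands in for whatever the real stack would be:

```python
class CachedRecord:
    """A locally stored copy of a master record, tagged with its version."""
    def __init__(self, master_id, data, version):
        self.master_id = master_id
        self.data = data
        self.version = version  # version stamp copied from the master

class MasterClient:
    """Stand-in for the MDM service; serves version stamps and full records."""
    def __init__(self, records):
        self._records = records  # master_id -> (data, version)

    def head(self, master_id):
        # Cheap check, analogous to an HTTP HEAD: version only, no payload.
        return self._records[master_id][1]

    def get(self, master_id):
        data, version = self._records[master_id]
        return CachedRecord(master_id, data, version)

def get_customer(cache, master, master_id):
    """Use the local copy only if the master still agrees on its version."""
    cached = cache.get(master_id)
    if cached is not None and master.head(master_id) == cached.version:
        return cached                  # local data is still valid
    fresh = master.get(master_id)      # otherwise refresh from the master
    cache[master_id] = fresh
    return fresh
```

The point is that the cheap `head` round-trip replaces a full fetch whenever the cached version is still current.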

Answered Oct 15 '22 by Paolo


The pros and cons of solution 1 are:

Pros:

  - Faster response time (vs. having to consult the Master on each operation; you could cache something to alleviate this).
  - Your satellite system can keep working even if the Master is temporarily unavailable.

Cons:

  - You risk working with stale data (if your ETL refresh runs at midnight, you won't see any new or updated records until the next midnight cycle).
  - Of course you can't allow the satellite to ever modify its local copy of the MDM data (two-way synchronization, especially with multiple different satellites, becomes a nightmare).

So depending on the specifics, solution 1 may be OK. I'd prefer to query the Master every time (again, possibly caching answers for a short time), but that is more of a personal preference.
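That "query the Master every time, but cache answers briefly" idea can be sketched as a small TTL cache. The class, TTL value, and fetch interface are all illustrative assumptions, with Python standing in for the real stack:

```python
import time

class TTLCache:
    """Serve recent answers locally; go back to the master once they expire."""
    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # callable that queries the master system
        self._ttl = ttl_seconds
        self._clock = clock          # injectable clock, handy for testing
        self._entries = {}           # master_id -> (expires_at, value)

    def get(self, master_id):
        now = self._clock()
        entry = self._entries.get(master_id)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh, skip the master round-trip
        value = self._fetch(master_id)
        self._entries[master_id] = (now + self._ttl, value)
        return value
```

Tuning `ttl_seconds` is the trade-off Paolo describes: a short TTL keeps the satellite close to the master's truth, a longer one saves round-trips at the cost of staleness.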

Answered Oct 15 '22 by p.marino