
What's the best way of synchronizing data between decoupled systems?

I have, let's say, two fully decoupled systems (there will be more in the future): System A and System B.

Every piece of information on each system has an informationID. Nothing stops the same informationID from appearing on different systems, so what uniquely identifies a piece of information across all systems is a Source-informationID pair.

Say I need to export a piece of information from System A to System B. I then want to export the same piece of information from System B and re-import it into System A, and System A needs to recognize that it is the same piece of information.
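
To make the round trip concrete, here is a minimal sketch in Python, assuming each record carries its origin pair with it wherever it goes. The `System` and `Record` classes, the `origin` field and the in-memory dictionaries are all hypothetical stand-ins for real storage; only the Source-informationID idea comes from the question.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (originalSource, originalSourceID): assigned once, never rewritten on import.
OriginKey = Tuple[str, int]


@dataclass
class Record:
    local_id: int      # informationID inside the system currently holding it
    payload: str
    origin: OriginKey  # identity that travels with the record


class System:
    """Hypothetical in-memory system with its own local ID sequence."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._next_id = 1
        self.records: Dict[int, Record] = {}
        self._by_origin: Dict[OriginKey, int] = {}

    def create(self, payload: str) -> Record:
        # A record born here uses this system's name and local ID as its origin.
        local_id = self._allocate_id()
        record = Record(local_id, payload, (self.name, local_id))
        self._store(record)
        return record

    def import_record(self, incoming: Record) -> Record:
        # Look the record up by origin pair, not by the sender's local ID.
        existing_id = self._by_origin.get(incoming.origin)
        if existing_id is not None:
            record = self.records[existing_id]
            record.payload = incoming.payload  # known record: update in place
            return record
        record = Record(self._allocate_id(), incoming.payload, incoming.origin)
        self._store(record)
        return record

    def _allocate_id(self) -> int:
        local_id = self._next_id
        self._next_id += 1
        return local_id

    def _store(self, record: Record) -> None:
        self.records[record.local_id] = record
        self._by_origin[record.origin] = record.local_id
```

With this shape, a record created in A, imported into B (where it gets a different local ID) and then re-imported into A resolves to the original A record, because the lookup uses the origin pair rather than either local ID.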

What's the best way of doing this in people's experience?

This is what I am thinking of doing:

  1. Set up a message bus between the systems with message queues.
  2. Set up endpoints for each system that will monitor changes and generate commands wrapped in messages that will be pumped into the queues (for example when a piece of information is created/deleted/updated).
  3. Assign ranks to the endpoints for create/delete/update commands, in order not to rely on system names but only on a general hierarchy, so that each system doesn't need to know about the others.
  4. Assign a threshold for update/delete/create commands to each endpoint, so that commands not meeting the threshold requirement will be filtered out and not processed.
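
The four steps above can be sketched roughly as follows in Python. This is only one possible reading of the rank/threshold idea: it assumes a command carries the sender's rank for its command type, and that a receiver processes it only when that rank is at least the receiver's threshold for the same type. `Endpoint`, `Command` and the comparison rule are hypothetical, and `queue.Queue` stands in for a real message queue.

```python
from dataclasses import dataclass
from enum import Enum
from queue import Queue
from typing import Dict, List


class CommandType(Enum):
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"


@dataclass(frozen=True)
class Command:
    type: CommandType
    original_source: str     # still carried around, as the question notes
    original_source_id: int
    sender_rank: int         # rank of the emitting endpoint for this command type
    payload: str = ""


class Endpoint:
    """Hypothetical endpoint: knows only ranks and thresholds, not system names."""

    def __init__(self, ranks: Dict[CommandType, int],
                 thresholds: Dict[CommandType, int]) -> None:
        self.ranks = ranks
        self.thresholds = thresholds
        self.processed: List[Command] = []

    def emit(self, bus: "Queue[Command]", type: CommandType,
             source: str, source_id: int, payload: str = "") -> None:
        # Wrap the change in a command stamped with this endpoint's rank.
        bus.put(Command(type, source, source_id, self.ranks[type], payload))

    def accepts(self, command: Command) -> bool:
        # Assumed rule: the sender's rank must meet the receiver's threshold.
        return command.sender_rank >= self.thresholds[command.type]

    def drain(self, bus: "Queue[Command]") -> None:
        # Process everything currently queued, dropping filtered commands.
        while not bus.empty():
            command = bus.get()
            if self.accepts(command):
                self.processed.append(command)
```

For example, an endpoint with a DELETE rank of 1 sending to an endpoint with a DELETE threshold of 3 would have its deletes dropped, while its creates and updates could still go through if their ranks are high enough.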

This won't solve the fact that I still need to carry around originalSource+originalSourceID though.

Any help appreciated.

JohnIdol asked Dec 15 '08 19:12




2 Answers

As somebody already wrote, this sounds like a typical EAI problem. EAI tools used to be expensive, but there is now a wide choice of free, open-source ones. Below is a list of the ones I like most:

  1. OpenESB
  2. Mule
  3. Apache ServiceMix
  4. Apache Camel

My favorite is OpenESB: I know it best, it has a full IDE (NetBeans), optional support from a big vendor, and a huge number of additional components. After that, I love Apache Camel for its simplicity and effectiveness. You can try a few of these and decide which one works best for you, and you can buy support services for any of them if you need to.

Maurizio answered Sep 30 '22 04:09


This problem has been addressed by EAI (Enterprise Application Integration) vendors like Tibco and webMethods (now part of Software AG). I've never used Tibco, but I've used webMethods to solve these kinds of problems, so I'll focus on webMethods.

For example, in an enterprise, data about employees could reside in both Active Directory and PeopleSoft. webMethods could be used to ensure that changes, additions and deletions in one system (application) are reflected in the other in real time. In some other organization, data about employees could also be in an Oracle or SQL Server database. Again, not a problem: EAI tools like webMethods can talk to a wide variety of back-ends.

webMethods is not limited to a single source and a single target. Because it has a publish-subscribe architecture, data from a single source can flow to multiple interested targets that subscribe to a particular piece of information. Guaranteed delivery and many other features can be found in these products. Back to the employee example: if it is done right, all systems and applications in an enterprise can contain the same information about the employees at any given time, without any discrepancy.
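
The publish-subscribe flow described above can be illustrated with a minimal, generic sketch in Python. This is not webMethods code or its API; the `Broker` class, the topic name and the dictionary "back-ends" are hypothetical, and there is no guaranteed delivery here, only the fan-out from one source to every subscribed target.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Message = Dict[str, str]
Handler = Callable[[Message], None]


class Broker:
    """Hypothetical in-memory broker: one publisher, many subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        # Deliver a copy to every subscriber; return how many received it.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(dict(message))
        return len(handlers)


def make_target(store: Dict[str, Message]) -> Handler:
    """Build a handler that upserts an employee into a target's store."""

    def handle(message: Message) -> None:
        store[message["employee_id"]] = message

    return handle


# Two hypothetical targets, standing in for e.g. a directory and an HR system.
directory_store: Dict[str, Message] = {}
hr_store: Dict[str, Message] = {}

broker = Broker()
broker.subscribe("employee.changed", make_target(directory_store))
broker.subscribe("employee.changed", make_target(hr_store))
delivered = broker.publish(
    "employee.changed", {"employee_id": "e-1", "name": "Ada"}
)
```

The publisher never names its targets; adding a third system only means adding another `subscribe` call.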

So instead of programming in C# or Java, you'll be doing webMethods programming, which is very much like working in a 4GL. I call it programming because there is still logic involved (loops, if-then-else, branches, variables, packages, etc.), but it's very procedure-oriented, i.e. there is no concept of OOP at all.

These EAI tools are built with a narrow set of purposes in mind, and one of them is to make it easy to synchronize data between disparate systems in an enterprise. They do that job very well.

The drawback is these tools cost a lot of money. Companies often have a long-term strategy before investing in these tools.

Kevin Le - Khnle answered Sep 30 '22 03:09