Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Real-time synchronization of database data across all the clients

What's the best strategy to keep all the clients of a database server synchronized?

The scenario involves a database server and a dynamic number of clients that connect to it, viewing and modifying the data.

I need real-time synchronization of the data across all the clients - if data is added, deleted, or updated, I want all the clients to see the changes in real-time without putting too much strain on the database engine by continuous polling for changes in tables with a couple of million rows.

Now I am using a Firebird database server, but I'm willing to adopt the best technology for the job, so I want to know if there is any kind of already existing framework for this kind of scenario, what database engine does it use and what does it involve?

like image 748
luvieere Avatar asked Nov 20 '09 20:11

luvieere


People also ask

What is real time data synchronization?

Data synchronization, or data sync, is the continual process of keeping a record type identical between two or more systems. This can be done in real time, in near real time, or in batches.

What is database synchronization?

Database synchronization establishes data consistency between databases and automatically copies changes back and forth. Data harmonization over time occurs continuously, and the most trivial case is pulling data from the source database to the destination.

Which is the process of synchronising data cross multiple servers?

Data synchronization is the ongoing process of synchronizing data between two or more devices and updating changes automatically between them to maintain consistency within systems. While the sheer quantity of data afforded by the cloud presents challenges, it also provides the perfect solution for big data.

Which technique can be used to perform data synchronization?

iPaaS are third-party, cloud-based solutions that specialize in integrating business tools. There are a lot of different types of iPaaS tools, not all of which work best for data synchronization. In addition, those that do specialize in data syncing often sync different types of data, so they can be used in tandem.


1 Answers

Firebird has a feature called EVENT that you may be able to use to notify clients of changes to the database. The idea is that when data in a table is changed, a trigger posts an event. Firebird takes care of notifying all clients who have registered an interest in the event by name. Once notified, each client is responsible for refreshing its own data by querying the database.

The client can't get info from the event about the new or old values. This is by design, because there's no way to resolve this with transaction isolation. Nor can your client register for events using wildcards. So you have to design your server-to-client notification pretty broadly, and let the client update to see what exactly changed.

See http://www.firebirdsql.org/doc/whitepapers/events_paper.pdf

You don't mention what client platform or language you're using, so I can't advise on the specific API you would use. I suggest you google for instance "firebird event java" or "firebird event php" or similar, based on the language you're using.


Since you say in a comment that you're using WPF, here's a link to a code sample of some .NET application code registering for notification of an event:

http://www.firebirdsql.org/index.php?op=devel&sub=netprovider&id=examples#3


Re your comment: Yes, the Firebird event mechanism is limited in its ability to carry information. This is necessary because any information it might carry could be canceled or rolled back. For instance if a trigger posts an event but then the operation that spawned the trigger violates a constraint, canceling the operation but not the event. So events can only be a kind of "hint" that something of interest may have happened. The other clients need to refresh their data at that time, but they aren't told what to look for. This is at least better than polling.

So you're basically describing a publish/subscribe mechanism -- a message queue. I'm not sure I'd use an RDBMS to implement a message queue. It can be done, but you're basically reinventing the wheel.

Here are a few message queue products that are well-regarded:

  • Microsoft MSMQ (seems to be part of Windows Professional and Server editions)
  • RabbitMQ (free open-source)
  • Apache ActiveMQ (free open-source)
  • IBM WebSphere MQ (probably overkill in your case)

This means that when one client modifies data in a way that others may need to know about, that client also has to post a message to the message queue. When consumer clients see the message they're interested in, they know to refresh their copy of some data.

like image 164
Bill Karwin Avatar answered Oct 23 '22 05:10

Bill Karwin