Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strategies for Java ORM with Unreliable Network and Low Bandwidth

I am looking at Hibernate for a system which needs to work in an unreliable network. There is a single central database that we need read-write access to, but it is available over a pretty patchy wi-fi network. In addition, there may be power losses which do not shutdown the application cleanly, so any solution must have a persistent cache which can survive power-cycles. Lastly this is an embedded system with only modest memory, and disk space so for example doing full blown replication of the database is not a feasible strategy.

I have a basic understanding of Hibernate 2nd Level caching, and I am wondering if it is possible to configure this with something like Ehcache to solve this problem, but the main thrust of that seems to be performance not availability, so I am not aware of what the pitfalls might be.

I am also quite willing to consider other strategies which involve replication to a local database. I would rather not have to do too much of the heavy lifting myself to implement this.

Looking for some experience or possible alternatives.

like image 463
Dean Povey Avatar asked Apr 30 '11 22:04

Dean Povey


2 Answers

"In addition, there may be power losses which do not shutdown the application cleanly, so any solution must have a persistent cache which can survive power-cycles."

You already have a solution in your mind with Hibernate level 2 cache. But you didn't say what are the real requirements. You have an unrealiable network. That's OK, you have unrealiable power supply. That's Ok too. Now what level of service do you want to achieve ? What is acceptable or not ?

Is data loss acceptable ? How much could you accept ? What risk do you accept ?

To be more explicit, let say you have a local replica of the database or at least part of it. Let say you know how to queue/save modification made locally. Let say you store theses modification on a harddrive so to be safe in case of power failure. Let say you are able to merge changes with the main database when connection is avaialable again.

That's already a lot of assumptions. Ok but what happens if one harddrive fail after a powerfailure ? You know that harddrive don't like power failure and tend to be corrupted on power failure or even can be damaged ?

So you put on a RAID, and add an uninterruptible power supply. That's nice. Your detect power failure event from the OS. Finish your current transaction and correctly shutdown. You RAID protect you from a disk failure.

Ok, but what happens if the whole computer stop functionning ? What happens in case of fire ? Or water damage ? All disk will be managed, data unrecoverable and what is not synchronized with the central database is lost. Is it acceptable or not ?

Even when the wifi is on, the power supply work perfectly... What is the reliability of the central database anyway ? Do you have regular backups ? Or a clustering solution ? Are you sure your central database is reliable anyway ?

From a Database point of view, it is easy to use a cluster or backup and use transactions to ensure dataconsistency. You can still loose data (if not using a cluster in particular), but you should be able to recover up to the last backup for exemple.

But if you want to work offline (with database not available), and you are not the only one that can modify the database, conflicts WILL occurs. This is no longer a cache, hibernate or anything technical problem.

This is functional problem. What to do when several modifications occurs offline and you have to merge ? What is acceptable ? What is not. This might be that on reconnect, the most recent change apply, older changes are discarded. Or ptential conflicts are detected and prompts user to deal with them. You can try to apply queued change and apply all of them...

I would tend to consider that you can offer an "offline mode" but your users must be aware they are offline, and should have a notification when the change are being made permanent on central database with eventual conflict resolution. But that my point of view.

like image 140
Nicolas Bousquet Avatar answered Oct 19 '22 08:10

Nicolas Bousquet


You can't expect to succeed with a network like that between hibernate and the database.

I recommend that you define a set of high-level atomic operations, and then define a set of (e.g.) restful services for them. Or, if you like, you can use soap and look into the WS-* options for reliable messaging to take care of retries and all the other messy details.

Or, you could investigate whether something like cassandra across the link would work better than SQL, or something else big on replication.

like image 41
bmargulies Avatar answered Oct 19 '22 08:10

bmargulies