
Master-slave vs peer-to-peer distributed computing

HBase has a master-slave model, while Cassandra has a peer-to-peer model. I am aware that in a master-slave model, the master is a SPOF (Single Point of Failure) and there is no such thing in a peer-to-peer model.

Are there any other pros and cons of each model? Specifically, I am looking for any advantages of the master-slave model over the peer-to-peer model.

Praveen Sripati asked Jan 24 '12 14:01



1 Answer

One side point: the master is not a SPOF in HBase, because you can run a multi-master configuration. See http://wiki.apache.org/hadoop/Hbase/MultipleMasters

Having the masters makes it a little easier to know where the data is and where it is going. HBase is also built on Hadoop, so the integration with MapReduce is quite nice: a map job naturally splits out across the region servers and hands you one row at a time. I think this is the main plus.

Cassandra's primary "con" is its eventual consistency model, although it lets you choose the consistency level for each read and write.
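To make the tunable-consistency point concrete, here is a minimal sketch of the usual replica-overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (N). The function names are made up for illustration and are not part of any Cassandra API.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


n = 3  # assumed replication factor for this example

# Quorum writes plus quorum reads always overlap in at least one replica.
print(reads_see_latest_write(n, quorum(n), quorum(n)))

# Writing to one replica and reading from one replica may miss the write,
# which is the eventually consistent case.
print(reads_see_latest_write(n, 1, 1))
```

The trade-off is latency and availability: the more replicas each operation has to wait for, the stronger the guarantee and the fewer node failures it tolerates.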

One comparison point is that data in HBase is sorted by key, whereas Cassandra by default spreads rows across nodes randomly (by a hash of the key). Sorted storage can pay off in HBase if you design smart keys, and you can always choose a GUID or random key to emulate Cassandra's behavior. Cassandra can also partition non-randomly, but HBase is still better for range scans.
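A rough sketch of why key ordering matters for range scans, using plain Python lists rather than either database. With sorted keys a range is one contiguous slice; with hash placement the same range is scattered across nodes. The key format, node count, and function names are assumptions for illustration, and MD5 is only a stand-in for a random partitioner's hash.

```python
import bisect
import hashlib


def range_scan_sorted(sorted_keys, start, stop):
    """Scan [start, stop) over sorted keys: two binary searches, one slice."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


def hash_partition(key, num_nodes):
    """Place a key on a node by hashing it (toy random partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def nodes_touched_by_range(keys, start, stop, num_nodes):
    """Nodes holding at least one key in [start, stop) under hash placement."""
    return {hash_partition(k, num_nodes) for k in keys if start <= k < stop}


# Hypothetical keys: user0000 .. user0999, kept in sorted order.
keys = sorted(f"user{i:04d}" for i in range(1000))

# Sorted storage: the range is a single contiguous run of keys.
print(range_scan_sorted(keys, "user0100", "user0105"))

# Hash placement: the same kind of range lands on several nodes.
print(nodes_touched_by_range(keys, "user0100", "user0200", 4))
```

The flip side is that sequential keys (timestamps, counters) in a sorted store send all new writes to the same region, which is one reason to pick a GUID or hashed key there.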

I've used both, and they both work, and both take a lot of work to keep working.

MattMcKnight answered Oct 06 '22 09:10
