Master-slave vs peer-to-peer distributed computing

Question

HBase has a master-slave model, while Cassandra has a peer-to-peer model. I am aware that in a master-slave model, the master is a SPOF (Single Point of Failure) and there is no such thing in a peer-to-peer model.

Are there any other pros and cons of each model? In particular, I am looking for any advantages of master-slave over the peer-to-peer model.

MattMcKnight · Accepted Answer

As a side note, the master is not a SPOF in HBase, because you can run a multi-master configuration: http://wiki.apache.org/hadoop/Hbase/MultipleMasters

Having the masters makes it a little easier to know where the data is and where it is going. HBase is also built on Hadoop, so the integration with MapReduce is quite nice: a map job naturally splits across the region servers, and each map call is handed a row. I think this is the main plus.

Cassandra's primary "con" is its eventual consistency model, although it does let you choose the consistency level for each read and write.
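To make "choosing a consistency level" concrete, here is a minimal sketch of the usual quorum rule of thumb: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names below are my own for illustration and are not part of any Cassandra API.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_acks + write_acks > replication_factor


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    # Quorum writes plus quorum reads overlap on at least one replica.
    print(n, q, reads_see_latest_write(n, q, q))
    # Writing to one replica and reading from one replica may miss the write.
    print(n, 1, reads_see_latest_write(n, 1, 1))
```

The trade-off is latency and availability: the more replicas each operation has to wait for, the slower it gets and the fewer node failures it tolerates.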

Another point of comparison is that data in HBase is sorted by key, whereas with Cassandra's default hash-based partitioner it is spread effectively at random. Sorted keys can pay off in HBase if you design your keys carefully, and you can always choose a GUID or random key to emulate Cassandra's behavior. Cassandra can also partition non-randomly (with an order-preserving partitioner), but HBase is still better for range scans.
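A toy sketch of why sorted placement helps range scans. It does not use either system's real API; the split points, node count, key format, and the choice of MD5 as the hash are all assumptions made for illustration. Keys placed by sorted range keep neighbouring keys on the same node, while keys placed by hash are spread around, so a scan over a contiguous key range generally has to contact more nodes.

```python
import bisect
import hashlib


def range_partition(keys, split_points):
    """Place each key on a node by sorted key range (region-style)."""
    placement = {}
    for key in keys:
        # Index of the first split point greater than the key.
        node = bisect.bisect_right(split_points, key)
        placement.setdefault(node, []).append(key)
    return placement


def hash_partition(keys, num_nodes):
    """Place each key on a node by a hash of the key (random-style)."""
    placement = {}
    for key in keys:
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        node = int(digest, 16) % num_nodes
        placement.setdefault(node, []).append(key)
    return placement


def nodes_for_range(placement, start, stop):
    """Nodes holding at least one key in the half-open range [start, stop)."""
    return sorted(
        node
        for node, node_keys in placement.items()
        if any(start <= k < stop for k in node_keys)
    )


if __name__ == "__main__":
    keys = ["user%03d" % i for i in range(100)]
    by_range = range_partition(keys, ["user025", "user050", "user075"])
    by_hash = hash_partition(keys, 4)
    print("range placement:", nodes_for_range(by_range, "user010", "user020"))
    print("hash placement: ", nodes_for_range(by_hash, "user010", "user020"))
```

With the range placement the scan stays on a single node. The flip side is that sequential keys (timestamps, for example) all land on the same node, which is why random or GUID keys are sometimes used to spread the write load.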

I've used both, and they both work, and both take a lot of work to keep working.

Tags: cassandra, distributed-computing, hbase, p2p, master-slave

Praveen Sripati
