
Cassandra replication or RAID?

Tags:

cassandra

raid

With a traditional RDBMS we are used to RAID 10 in most cases. But with Cassandra at RF=2 we already have exactly one extra copy of the data as a backup, so in that case why use, or why not use, RAID 10?

I think RAID 10 would reduce the replication overhead on Cassandra.

Moreover, with RAID 10 the whole node keeps working if a hard drive fails. If we rely on replication alone, would a single hard drive failure take the whole node down?

I do expect RAID 10 to add overhead on each write, but since flushing to an SSTable happens only when the memtable is full, that overhead should not be felt all the time.

Gary Lindahl asked Aug 29 '11 18:08





1 Answer

I would argue that RAID 10 is a waste of money, for two reasons:

1) One of the important attributes of BigTable-style stores (Cassandra or HBase) is the ability to quickly and cheaply expand your cluster or add redundancy by adding new servers. Based on recent prices, RAID 10 (striping AND mirroring) is so expensive that it costs virtually the same as adding another whole server with JBOD storage.

2) Cassandra replication protects you from machine failure, not just disk failure. RAID 10 won't protect you if your CPU dies, but Cassandra replication will. Replication also protects you from disk failure and lets multiple clients read from multiple nodes, which prevents hotspots.
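
To put rough numbers on the cost argument, here is a minimal sketch in Python. The function name `usable_capacity_tb` and the cluster sizes are hypothetical, chosen only for illustration, and the model deliberately ignores compaction headroom, filesystem overhead, and compression. It only captures two facts: RAID 10 mirrors every disk, and Cassandra stores each row RF times.

```python
def usable_capacity_tb(nodes, disks_per_node, disk_tb, replication_factor, raid10):
    """Approximate unique data (in TB) a cluster can hold.

    Hypothetical helper for illustration; ignores compaction headroom,
    filesystem overhead, and compression.
    """
    raw_per_node = disks_per_node * disk_tb
    # RAID 10 mirrors every disk, so half of the raw space holds mirror copies.
    per_node = raw_per_node / 2 if raid10 else raw_per_node
    # Cassandra stores each row replication_factor times across the cluster.
    return nodes * per_node / replication_factor


# Assumed example cluster: 4 nodes, 4 x 1 TB disks each, RF=2.
jbod = usable_capacity_tb(nodes=4, disks_per_node=4, disk_tb=1.0,
                          replication_factor=2, raid10=False)
raid = usable_capacity_tb(nodes=4, disks_per_node=4, disk_tb=1.0,
                          replication_factor=2, raid10=True)
print(jbod, raid)  # 8.0 4.0
```

Under these assumptions, stacking RAID 10 on top of RF=2 halves the unique data the same hardware can hold, so matching the JBOD capacity would take roughly twice as many disks.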

Chris Shain answered Oct 23 '22 03:10
