
Cassandra node hardware requirement

Tags:

cassandra

We have a lot of keyspaces with RF=3, and each keyspace holds ~1GB of data. Can we serve such a configuration effectively with small nodes, e.g. 4GB RAM + 60GB SSD?

asked by maxim_ge, Jun 18 '17 14:06




1 Answer

Your question is missing some parameters:

  1. The number of keyspaces.

  2. The number of nodes you want to use.

  3. The number of cores per node.

But anyway:

  1. 4GB RAM is kind of a minimal requirement. You will be able to run nodes, but there will be no room for OS-level caching, the Java heap will be very small, etc. Most best-practice guides suggest 8GB or 16GB as the minimum.

  2. 60GB of SSD - it depends on the amount of data per server. If you are planning to use STCS, you should not go much beyond 50% disk usage, because compaction needs free space to rewrite SSTables. That leaves about 30GB. If you have a very small data set you can live with that, but if it grows, you will need more storage.

As general advice, I'd suggest using servers with more RAM. It is theoretically possible to run such a configuration in production, but it will probably cause more problems than it is worth. Expect crashes, GC problems, out-of-memory errors, performance drops, etc.
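To put a number on "the Java heap will be very small": as far as I know, when MAX_HEAP_SIZE is not set explicitly, the stock cassandra-env.sh of that era picks the heap as max(min(RAM/2, 1GB), min(RAM/4, 8GB)). The sketch below only reproduces that arithmetic under that assumption; the function name is mine, and you should check the script shipped with your version before relying on it.

```python
def default_max_heap_mb(system_ram_mb: int) -> int:
    """Approximate the automatic heap size chosen by cassandra-env.sh.

    Assumption: the rule is max(min(RAM/2, 1024MB), min(RAM/4, 8192MB)),
    which is what the stock script used when MAX_HEAP_SIZE was unset.
    """
    half_capped = min(system_ram_mb // 2, 1024)
    quarter_capped = min(system_ram_mb // 4, 8192)
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for ram_gb in (4, 8, 16, 32):
        heap_mb = default_max_heap_mb(ram_gb * 1024)
        print(f"{ram_gb}GB RAM -> {heap_mb}MB heap")
```

On a 4GB node this gives a heap of about 1GB, and the remaining memory is shared between off-heap structures, the OS, and the page cache.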

EDIT:

  1. 2 CPU cores - that is very low. Cassandra uses the CPU heavily during compaction, for compression (if enabled), for reading data (more so if it is compressed), etc. Try to get more cores if you can.

  2. 4GB RAM minimum - it doesn't depend on keyspace size. The absolute minimum is around 2GB AFAIK, but in most cases Cassandra will consume more, and since the OS is running on the same machine, it will be hard to live with such a small amount. DataStax recommends starting in production with 32GB, see http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningHardware.html

  3. With 15 servers at 60GB each, there will be 900GB of raw storage available. 100 keyspaces of 1GB each is 100GB of data (about 300GB on disk once RF=3 is applied), so from the storage perspective you should be OK :).
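The storage estimate can be written out as a quick back-of-the-envelope check. This is only a rough sketch: it assumes data is spread evenly across nodes, that "1GB per keyspace" is the size before replication, and that you keep disk usage under 50% for STCS. The function and parameter names are made up for illustration.

```python
def cluster_storage_check(nodes: int,
                          disk_gb_per_node: float,
                          keyspaces: int,
                          gb_per_keyspace: float,
                          replication_factor: int,
                          max_disk_fraction: float = 0.5) -> dict:
    """Rough capacity estimate; ignores snapshots, commit logs and overhead."""
    raw_gb = nodes * disk_gb_per_node
    # Leave headroom for compaction (about 50% with STCS).
    usable_gb = raw_gb * max_disk_fraction
    # Every row is stored replication_factor times across the cluster.
    stored_gb = keyspaces * gb_per_keyspace * replication_factor
    return {
        "raw_gb": raw_gb,
        "usable_gb": usable_gb,
        "stored_gb": stored_gb,
        "stored_gb_per_node": stored_gb / nodes,  # assumes even distribution
        "fits": stored_gb <= usable_gb,
    }


if __name__ == "__main__":
    result = cluster_storage_check(nodes=15, disk_gb_per_node=60,
                                   keyspaces=100, gb_per_keyspace=1,
                                   replication_factor=3)
    for key, value in result.items():
        print(f"{key}: {value}")
```

With the numbers from this thread it comes to roughly 20GB per node against a 30GB budget, so it fits, but without much room for growth.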

answered by nevsv, Oct 19 '22 23:10