We have a lot of keyspaces, RF=3, and each keyspace has ~1GB of data. Can we effectively serve such a configuration with small nodes, like 4GB RAM + 60GB SSD?
The minimal number should be 5, as a lower number (such as 3) will result in high stress on the machines during node failure (replication factor is 2 in this case, and each node will have to read 50% of the data and write 50% of data).
Maximum recommended capacity for Cassandra 1.2 and later is 3 to 5TB per node for uncompressed data. For Cassandra 1.1, it is 500 to 800GB per node. Be sure to account for replication.
Cassandra has a ring-type architecture. Cassandra has no master nodes and no single point of failure. Cassandra supports network topology with multiple data centers, multiple racks, and nodes.
Cassandra, with its distributed architecture, was a natural choice, and by 2013, most of Netflix's data was housed there, and Netflix still uses Cassandra today.
You are missing some parameters:
Number of keyspaces.
Number of nodes you want to use.
Number of cores per node.
But, anyway:
4GB RAM is kind of a minimal requirement. You will be able to run nodes, but there will be no room for OS-level caching, the Java heap will be very small, etc. Most best-practice guides suggest an 8/16GB configuration as a minimum.
60GB of SSD - it depends on the amount of data per server. If you are planning to use STCS (size-tiered compaction), you should not go much beyond 50% disk usage, which leaves you with 30GB. If you have a very small data set you can live with it, but if you go higher, you should use more storage.
As general advice, I'd suggest using servers with more RAM. It is theoretically possible to run such a configuration in production, but it will probably cause more problems than it is worth. Expect crashes, GC problems, out-of-memory errors, performance drops, etc.
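To put a rough number on "the Java heap will be very small": as far as I remember, the default `cassandra-env.sh` script picks the max heap as max(min(RAM/2, 1GB), min(RAM/4, 8GB)). The sketch below only reproduces that arithmetic in Python; the function name is made up for illustration, and the rule itself is an assumption you should check against the script shipped with your Cassandra version.

```python
def default_max_heap_mb(system_ram_mb: int) -> int:
    """Approximate the default heap rule from cassandra-env.sh (assumed, not verified per version)."""
    half_capped = min(system_ram_mb // 2, 1024)      # half of RAM, capped at 1GB
    quarter_capped = min(system_ram_mb // 4, 8192)   # quarter of RAM, capped at 8GB
    return max(half_capped, quarter_capped)


for ram_gb in (4, 8, 16, 32):
    heap_mb = default_max_heap_mb(ram_gb * 1024)
    print(f"{ram_gb} GB RAM -> {heap_mb} MB heap")
```

Under that rule a 4GB node ends up with about 1GB of heap, and the remaining 3GB has to cover the OS, off-heap structures and the page cache.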
EDIT:
2 CPU cores - that is very low. Cassandra uses the CPU heavily during compaction, compression (if enabled), reading data (more so if it is compressed), etc. Try to get more cores if you can.
4GB RAM minimum - it doesn't depend on keyspace size. The absolute minimum is around 2GB AFAIK, but in most cases Cassandra will consume more, and given that the OS is running there too, it will be problematic to live with such a small amount. DataStax recommends starting in production with 32GB, see http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningHardware.html
With 15 servers of 60GB storage each, there will be 900GB available. 100 keyspaces of 1GB each is 100GB (about 300GB once replicated with RF=3, if the 1GB figure is before replication), so from the storage perspective you should be OK :).
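If you want to redo this storage arithmetic with your own numbers, here is a minimal sketch. It assumes the ~1GB per keyspace is before replication, that data is spread evenly across nodes, and that you keep disk usage under 50% as suggested for STCS. The function and parameter names are just for illustration.

```python
def storage_check(nodes, disk_gb_per_node, keyspaces, gb_per_keyspace,
                  replication_factor=3, max_fill=0.5):
    """Rough capacity check; assumes even data distribution across nodes."""
    raw_gb = nodes * disk_gb_per_node
    usable_gb = raw_gb * max_fill  # headroom left free for compaction
    replicated_gb = keyspaces * gb_per_keyspace * replication_factor
    return {
        "raw_gb": raw_gb,
        "usable_gb": usable_gb,
        "replicated_gb": replicated_gb,
        "per_node_gb": replicated_gb / nodes,
        "fits": replicated_gb <= usable_gb,
    }


result = storage_check(nodes=15, disk_gb_per_node=60,
                       keyspaces=100, gb_per_keyspace=1)
print(result)
```

With the numbers from the question this gives 300GB of replicated data against 450GB of usable space, i.e. about 20GB per node, so storage is not the bottleneck here; RAM and CPU are.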