I am doing performance comparisons of ScyllaDB and Cassandra, specifically looking at the impact of memory. The machines I am using each have 16GB and 8 cores.
Based on the docs, Cassandra will default to 4GB Xmx and use the remaining 12GB as file system cache.
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsTuneJVM.html
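For reference, here is a minimal sketch of how I understand the 4GB default is derived. It assumes the formula described in the tuning docs, max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8GB)); the function name is my own and not part of Cassandra.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate Cassandra's default max heap size in MB.

    Assumed formula: max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)).
    """
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    return max(half_capped, quarter_capped)


total_mb = 16 * 1024  # the 16GB machines described above
heap_mb = default_max_heap_mb(total_mb)
print(f"heap: {heap_mb} MB, left for off-heap and page cache: {total_mb - heap_mb} MB")
```

On a 16GB machine this gives a 4096MB heap and leaves 12288MB for everything else.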
ScyllaDB instead will use all 16GB for itself.
http://docs.scylladb.com/faq/#scylla-is-using-all-of-my-memory-why-is-that-what-if-the-server-runs-out-of-memory
What I'm wondering is whether this is a fair comparison setup (4GB Xmx for Cassandra vs 16GB for Scylla). I realize this is what each project recommends, but would a fairer test be 8GB Xmx for Cassandra and --memory 8G for ScyllaDB? My workload is mostly write-intensive, and I don't expect file system caching to always be able to help Cassandra. It seems odd to me that ScyllaDB expects almost no file system caching, compared to Cassandra's heavy reliance on it.
Cassandra will always use all of the system memory; the heap size (-Xmx) setting just determines how much is used by the heap and how much by other memory consumers (off-heap structures and the page cache). So if you limit Scylla's memory usage, it will be at a disadvantage compared to Cassandra.
Scylla will use roughly half of its memory for memtables and the other half for key/partition caching. If your workload is mostly writes, additional memory will have less of an effect on performance, which should instead be bound by either I/O or CPU.
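To make the comparison concrete, here is a rough sketch of the memory budget under both setups from the question. The function names are hypothetical, the even memtable/cache split is only the approximation stated above, and anything the OS itself needs is ignored.

```python
TOTAL_GB = 16  # RAM per machine, from the question


def cassandra_budget(total_gb: float, heap_gb: float) -> dict:
    # Cassandra: whatever is not heap is still available to it
    # as off-heap structures and the OS page cache.
    return {"heap": heap_gb, "off_heap_and_page_cache": total_gb - heap_gb}


def scylla_budget(memory_gb: float) -> dict:
    # Scylla: assumed ~50/50 split between memtables and cache.
    return {"memtable": memory_gb / 2, "cache": memory_gb / 2}


setups = {
    "defaults (4GB Xmx vs all 16GB)": (cassandra_budget(TOTAL_GB, 4), scylla_budget(TOTAL_GB)),
    "proposed (8GB Xmx vs --memory 8G)": (cassandra_budget(TOTAL_GB, 8), scylla_budget(8)),
}

for name, (cassandra, scylla) in setups.items():
    print(name)
    print(f"  Cassandra can use {sum(cassandra.values())} GB: {cassandra}")
    print(f"  Scylla can use {sum(scylla.values())} GB: {scylla}")
```

In the proposed setup Cassandra still has all 16GB at its disposal (8GB heap plus 8GB off-heap and page cache), while Scylla is capped at 8GB in total, which is why limiting Scylla skews the comparison.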
I would recommend reading http://www.scylladb.com/2017/10/05/io-access-methods-scylla/ to understand how Scylla writes data, and http://www.scylladb.com/2016/12/15/sswc-part1/ to understand how Scylla balances I/O workloads.