I have a 4-node cluster with a 16-core CPU and 100 GB of RAM on each box (2 nodes on each rack).
As of now, all nodes are running with the default JVM settings of Cassandra (v2.1.4). With these settings, each node uses 13GB RAM and 30% CPU. It is a write-heavy cluster with occasional deletes or updates.
Do I need to tune the JVM settings of Cassandra to utilize more memory? What all things should I be looking at to make appropriate settings?
The jvm- files replace the cassandra-env.sh file used in Cassandra versions prior to Cassandra 3.0. The cassandra-env.sh bash script file is still useful if JVM settings must be dynamically calculated based on system settings. The jvm- files only store static JVM settings.
Off-heap memory is manually managed memory, which is used for:
- Bloom filters: Used to quickly test if an SSTable contains a partition.
- Index summary: A search lookup of index positions.
- Compression metadata.
Set the memory available to the JVM: -Xmx<size> sets the maximum Java heap size. The default value for the minimum is 2Mb, for the maximum it's 64Mb.
Do I need to tune the JVM settings of Cassandra to utilize more memory?
The DataStax Tuning Java Resources doc actually has some pretty sound advice on this:
Many users new to Cassandra are tempted to turn up Java heap size too high, which consumes the majority of the underlying system's RAM. In most cases, increasing the Java heap size is actually detrimental for these reasons:
- In most cases, the capability of Java to gracefully handle garbage collection above 8GB quickly diminishes.
- Modern operating systems maintain the OS page cache for frequently accessed data and are very good at keeping this data in memory, but can be prevented from doing its job by an elevated Java heap size.
If you have more than 2GB of system memory, which is typical, keep the size of the Java heap relatively small to allow more memory for the page cache.
As you have 100GB of RAM on your machines, (if you are indeed running under the "default JVM settings") your JVM max heap size should be capped at 8192M. And actually, I wouldn't deviate from that unless you are experiencing issues with garbage collection.
JVM resources for Cassandra can be set in the cassandra-env.sh file. If you are curious, look at the code for cassandra-env.sh and look for the calculate_heap_sizes() function. That should give you some insight as to how Cassandra computes your default JVM settings.
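If you do decide to override the computed defaults, the usual place is the pair of variables that cassandra-env.sh already reads. A minimal sketch of such a fragment follows; the values are purely illustrative assumptions for a 16-core box with an 8GB heap cap, not a recommendation, and the stock script expects the two variables to be set together rather than one at a time:

```shell
# Fragment for cassandra-env.sh (illustrative values only).
# When both variables are set, the automatic heap calculation is skipped.
MAX_HEAP_SIZE="8G"      # total JVM heap
HEAP_NEWSIZE="1600M"    # young generation size
```

Restart the node after changing these, and change them on one node first so you can compare its garbage collection behaviour with the rest of the cluster.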
What all things should I be looking at to make appropriate settings?
If you are running OpsCenter (and you should be), add a graph for "Heap Used" and "Non Heap Used."
This will allow you to easily monitor JVM heap usage for your cluster. Another thing that helped me was to write a bash script in which I basically hijacked the JVM calculations from cassandra-env.sh. That way I can run it on a new machine and know right away what my MAX_HEAP_SIZE and HEAP_NEWSIZE are going to be:
#!/bin/bash
clear
echo "This is how Cassandra will determine its default Heap and GC Generation sizes."
system_memory_in_mb=`free -m | awk '/Mem:/ {print $2}'`
half_system_memory_in_mb=`expr $system_memory_in_mb / 2`
quarter_system_memory_in_mb=`expr $half_system_memory_in_mb / 2`
echo " memory = $system_memory_in_mb"
echo " half = $half_system_memory_in_mb"
echo " quarter = $quarter_system_memory_in_mb"
# count CPU cores the same way cassandra-env.sh does on Linux
system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`
echo "cpu cores = $system_cpu_cores"
#cassandra-env logic duped here
#this should help you to see how much memory is being allocated
#to the JVM
if [ "$half_system_memory_in_mb" -gt "1024" ]
then
half_system_memory_in_mb="1024"
fi
if [ "$quarter_system_memory_in_mb" -gt "8192" ]
then
quarter_system_memory_in_mb="8192"
fi
if [ "$half_system_memory_in_mb" -gt "$quarter_system_memory_in_mb" ]
then
max_heap_size_in_mb="$half_system_memory_in_mb"
else
max_heap_size_in_mb="$quarter_system_memory_in_mb"
fi
MAX_HEAP_SIZE="${max_heap_size_in_mb}M"
# Young gen: min(max_sensible_per_modern_cpu_core * num_cores, 1/4 * heap size)
max_sensible_yg_per_core_in_mb="100"
# the "*" is quoted so the shell does not expand it as a glob
max_sensible_yg_in_mb=`expr $max_sensible_yg_per_core_in_mb "*" $system_cpu_cores`
desired_yg_in_mb=`expr $max_heap_size_in_mb / 4`
if [ "$desired_yg_in_mb" -gt "$max_sensible_yg_in_mb" ]
then
HEAP_NEWSIZE="${max_sensible_yg_in_mb}M"
else
HEAP_NEWSIZE="${desired_yg_in_mb}M"
fi
echo "Max heap size = " $MAX_HEAP_SIZE
echo " New gen size = " $HEAP_NEWSIZE
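To see what these rules give for hardware you are not logged in to, the same arithmetic can be wrapped in functions that take the memory and core count as arguments. This is only a sketch: default_heap_mb and default_newgen_mb are hypothetical helper names, and the 102400 MB / 16 core figures are an assumption based on the machines described in the question.

```shell
#!/bin/bash
# Sketch of the default heap rules as functions of memory (MB) and core count.

# max( min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB) )
default_heap_mb() {
    local mem_mb=$1
    local half=$((mem_mb / 2))
    local quarter=$((half / 2))
    if [ "$half" -gt 1024 ]; then half=1024; fi
    if [ "$quarter" -gt 8192 ]; then quarter=8192; fi
    if [ "$half" -gt "$quarter" ]; then echo "$half"; else echo "$quarter"; fi
}

# min( 100MB * cores, 1/4 of the heap )
default_newgen_mb() {
    local heap_mb=$1
    local cores=$2
    local max_sensible=$((100 * cores))
    local desired=$((heap_mb / 4))
    if [ "$desired" -gt "$max_sensible" ]; then echo "$max_sensible"; else echo "$desired"; fi
}

# Assumed figures for the question's nodes: 100 GB RAM, 16 cores.
heap=$(default_heap_mb 102400)
echo "Max heap size = ${heap}M"
echo " New gen size = $(default_newgen_mb "$heap" 16)M"
```

With those inputs the sketch prints a max heap of 8192M and a new generation of 1600M: a quarter of RAM is far above the 8192M cap, and 100MB per core (1600M) is smaller than a quarter of the heap (2048M).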
Update 20160212:
Also, be sure to check out Amy Tobey's 2.1 Cassandra Tuning Guide. She has some great tips on how to get your cluster running optimally.
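For the heap monitoring mentioned earlier, a quick command-line check can complement the OpsCenter graphs. The sketch below assumes that nodetool is on the PATH and that `nodetool info` prints a line of the form "Heap Memory (MB) : used / total"; the exact layout may differ between versions, and heap_used_pct is a hypothetical helper name.

```shell
#!/bin/bash
# Read `nodetool info`-style output on stdin and print heap usage as a percentage.
# Only the line starting with "Heap Memory" is used; "Off Heap Memory" is ignored.
heap_used_pct() {
    awk -F'[:/]' '/^Heap Memory/ { printf "%.0f\n", ($2 / $3) * 100 }'
}

# On a live node (assumption: nodetool is installed and the node is up):
#   nodetool info | heap_used_pct
```

A value that stays near 100 between collections is the kind of signal that would justify revisiting the heap settings; a node sitting comfortably below that is usually best left on the defaults.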
system_cpu_cores is not set in the script as originally posted. Here is an edited version that runs:
#!/bin/bash
clear
echo "This is how Cassandra will determine its default Heap and GC Generation sizes."
system_memory_in_mb=`free -m | awk '/Mem:/ {print $2}'`
half_system_memory_in_mb=`expr $system_memory_in_mb / 2`
quarter_system_memory_in_mb=`expr $half_system_memory_in_mb / 2`
system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`
echo " memory = $system_memory_in_mb"
echo " half = $half_system_memory_in_mb"
echo " quarter = $quarter_system_memory_in_mb"
echo "cpu cores = $system_cpu_cores"
#cassandra-env logic duped here
#this should help you to see how much memory is being allocated
#to the JVM
if [ "$half_system_memory_in_mb" -gt "1024" ]
then
half_system_memory_in_mb="1024"
fi
if [ "$quarter_system_memory_in_mb" -gt "8192" ]
then
quarter_system_memory_in_mb="8192"
fi
if [ "$half_system_memory_in_mb" -gt "$quarter_system_memory_in_mb" ]
then
max_heap_size_in_mb="$half_system_memory_in_mb"
else
max_heap_size_in_mb="$quarter_system_memory_in_mb"
fi
MAX_HEAP_SIZE="${max_heap_size_in_mb}M"
# Young gen: min(max_sensible_per_modern_cpu_core * num_cores, 1/4 * heap size)
max_sensible_yg_per_core_in_mb="100"
max_sensible_yg_in_mb=`expr $max_sensible_yg_per_core_in_mb "*" $system_cpu_cores`
desired_yg_in_mb=`expr $max_heap_size_in_mb / 4`
if [ "$desired_yg_in_mb" -gt "$max_sensible_yg_in_mb" ]
then
HEAP_NEWSIZE="${max_sensible_yg_in_mb}M"
else
HEAP_NEWSIZE="${desired_yg_in_mb}M"
fi
echo "Max heap size = " $MAX_HEAP_SIZE
echo " New gen size = " $HEAP_NEWSIZE