
Cassandra in-memory configuration

Tags:

cassandra

We are currently evaluating Apache Cassandra 1.2 as a large-scale data processing solution. Because our application is read-intensive and we want to give users the fastest possible response time, we would like to configure Cassandra to keep all data in memory.

Is it enough to set the caching storage option to rows_only on all column families and give each Cassandra node enough memory to hold its share of the data? Or are there other options in Cassandra?

user1977546 asked Jan 14 '13 14:01




1 Answer

Tuning read performance is much more complex than tuning writes. Based on my experience, there are several factors to take into consideration. Some of them are not memory related, but they also help improve read performance.

1. Row cache: avoids disk hits, but enable it only if the rows are not updated frequently. You can also use the off-heap row cache to reduce JVM heap usage.

2. Key cache: enabled by default, and there is no need to disable it. It avoids searching on disk for a row's location when the read misses the row cache.

3. Reduce the frequency of memtable flushes: adjust memtable_total_space_in_mb, commitlog_total_space_in_mb, and flush_largest_memtables_at.

4. Use LeveledCompactionStrategy: it avoids having a row spread across multiple SSTables.
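
As a rough illustration of points 1, 2 and 4, here is a small Python sketch that only builds the CQL3 statements you might run in cqlsh. It does not connect to a cluster. The keyspace and table names (demo_ks, users, events) and the helper name tuning_statements are made up for the example, and the statement syntax is what I would expect for Cassandra 1.2, so check it against your version. The default caching value here is 'all', which keeps both the key cache and the row cache; 'rows_only' from the question would turn the key cache off for that table. The settings from point 3 live in cassandra.yaml and are not covered here.

```python
# Sketch only: build per-table CQL3 statements for caching and compaction.
# Keyspace and table names are hypothetical placeholders.

ALLOWED_CACHING = {"all", "keys_only", "rows_only", "none"}


def tuning_statements(keyspace, tables, caching="all"):
    """Return ALTER TABLE statements that set the caching mode and
    switch each table to LeveledCompactionStrategy."""
    if caching not in ALLOWED_CACHING:
        raise ValueError("caching must be one of %s" % sorted(ALLOWED_CACHING))
    statements = []
    for table in tables:
        target = "%s.%s" % (keyspace, table)
        # Points 1 and 2: row cache and key cache are chosen per table.
        statements.append(
            "ALTER TABLE %s WITH caching = '%s';" % (target, caching)
        )
        # Point 4: leveled compaction.
        statements.append(
            "ALTER TABLE %s WITH compaction = "
            "{ 'class' : 'LeveledCompactionStrategy' };" % target
        )
    return statements


if __name__ == "__main__":
    for stmt in tuning_statements("demo_ks", ["users", "events"]):
        print(stmt)
```

Keep in mind that the per-table caching setting only has an effect if the node-wide row cache has room: as far as I know, row_cache_size_in_mb in cassandra.yaml defaults to 0 in 1.2, which leaves the row cache disabled.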

Stanley Wang answered Sep 22 '22 21:09