
Does Cassandra flush memtables on nodetool stopdaemon? If not, what can be done to avoid data loss?

I am using apache-cassandra-3.10

I understand that, instead of kill -9 pid, the only way to stop Cassandra gracefully is nodetool stopdaemon.

But I want to know whether nodetool stopdaemon also flushes the data in the memtables to SSTables before shutdown.

If it does not flush, then stopping the node with nodetool stopdaemon would lead to data loss.

Also, after researching this, I read about DURABLE_WRITES. What does a durable write actually do?

Also, the DataStax documentation states, under the section Setting DURABLE_WRITES: "Do not set this attribute on a keyspace using the SimpleStrategy."

Reference: https://docs.datastax.com/en/cql/3.1/cql/cql_reference/create_keyspace_r.html

If my keyspace is configured with SimpleStrategy, does that mean I cannot benefit from DURABLE_WRITES, in case it can help with data loss on shutdown?

Is manually running nodetool flush before shutdown the only way to make sure we do not lose data on shutdown?

I read at https://issues.apache.org/jira/browse/CASSANDRA-3564 that the functionality to flush at shutdown has not been added.

There is also an open ticket on the same issue: https://issues.apache.org/jira/browse/CASSANDRA-12001

The intention is to avoid any data loss at shutdown when using nodetool stopdaemon. Basically, I want to flush all tables before shutdown, considering that SimpleStrategy is in use.

Syed Ammar Mustafa asked Mar 21 '17 11:03



1 Answer

Running nodetool drain before shutdown will suffice.
From the DataStax documentation on nodetool drain:

Flushes all memtables from the node to SSTables on disk. Cassandra stops listening for connections from the client and other nodes. You need to restart Cassandra after running nodetool drain.

Then you can either kill the process or run nodetool stopdaemon.
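The drain-then-stop sequence could be wrapped in a small shell function along these lines. This is only a sketch: the function name graceful_stop and the NODETOOL variable are made up for illustration, and it assumes nodetool is on the PATH and can reach the local node.

```shell
#!/bin/sh
# Sketch only: graceful_stop and NODETOOL are hypothetical names, not part of Cassandra.
# NODETOOL can be set to a full path if nodetool is not on the PATH.
NODETOOL="${NODETOOL:-nodetool}"

graceful_stop() {
    # Flush all memtables to SSTables and stop accepting connections.
    "$NODETOOL" drain || {
        echo "nodetool drain failed; not stopping the node" >&2
        return 1
    }
    # Only after a successful drain, stop the Cassandra daemon.
    "$NODETOOL" stopdaemon
}
```

Calling graceful_stop on the node runs the two commands in order and skips stopdaemon if drain reports an error, so the node is never stopped with unflushed memtables by this script.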

observer answered Oct 05 '22 19:10