 

How to prevent Cassandra commit logs filling up disk space

I'm running a two-node DataStax AMI cluster on AWS. Yesterday, Cassandra started refusing all connections. The system logs showed nothing. After a lot of tinkering, I discovered that the commit logs had filled up all the disk space on the allotted mount, and this seemed to be causing the connection refusals (after I deleted some of the commit logs and restarted, I was able to connect).

I'm on DataStax AMI 2.5.1 and Cassandra 2.1.7

If I decide to wipe and restart everything from scratch, how do I ensure that this does not happen again?

plamb asked Jul 30 '15 20:07



2 Answers

You could try lowering the commitlog_total_space_in_mb setting in your cassandra.yaml. The default is 8192MB for 64-bit systems (it should be commented-out in your .yaml file... you'll have to un-comment it when setting it). It's usually a good idea to plan for that much commit log space when sizing your disk(s).

You can verify this by running a du on your commitlog directory:

$ du -d 1 -h ./commitlog
8.1G    ./commitlog

Keep in mind, though, that a smaller commit log space will cause more frequent flushes (increased disk I/O), so you'll want to keep an eye on that.
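If you'd rather check this from a script than run du by hand, something like the following could work. It's only a rough sketch: the function names are made up for illustration, the path you pass in depends on your install (for example, /var/lib/cassandra/commitlog is the usual package default, but the AMI may mount it elsewhere), and it sums file sizes rather than disk blocks, so the number can differ slightly from what du prints.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (roughly what `du` reports)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # A segment can be removed between listing and stat; skip it.
                pass
    return total


def commitlog_over_limit(path, limit_mb):
    """Return (used_mb, over_limit) for the given commit log directory.

    limit_mb is meant to be whatever you set commitlog_total_space_in_mb to.
    """
    used_mb = dir_size_bytes(path) / (1024 * 1024)
    return used_mb, used_mb > limit_mb
```

For example, commitlog_over_limit("/var/lib/cassandra/commitlog", 8192) would tell you whether the directory has grown past the 8192MB default mentioned above.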

Edit 20190318

Just had a related thought (on my 4-year-old answer). I saw that it received some attention recently, and wanted to make sure that the right information is out there.

It's important to note that sometimes the commit log can grow in an "out of control" fashion. Essentially, this can happen because the write load on the node exceeds Cassandra's ability to keep up with flushing the memtables (and thus, removing old commitlog files). If you find a node with dozens of commitlog files, and the number seems to keep growing, this might be your issue.
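To tell "dozens of files and growing" apart from a one-off spike, you could sample the segment count a few times. A minimal sketch follows; the helper names are hypothetical, and the CommitLog-*.log pattern is my assumption about how the segment files are named, so check it against what is actually in your commitlog directory.

```python
import glob
import os


def count_segments(commitlog_dir, pattern="CommitLog-*.log"):
    """Count commit log segment files in commitlog_dir.

    The default pattern is an assumption about segment file naming.
    """
    return len(glob.glob(os.path.join(commitlog_dir, pattern)))


def keeps_growing(counts):
    """True if every sample is larger than the one before it.

    Requires at least three samples so a single jump is not flagged.
    """
    return len(counts) >= 3 and all(b > a for a, b in zip(counts, counts[1:]))
```

You would call count_segments on a schedule (say, every few minutes), collect the results in a list, and treat keeps_growing returning True as a hint that flushing is not keeping up with the write load.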

In that case, your memtable_cleanup_threshold may be too low. Although this property is deprecated, you can still control how it is calculated by lowering the number of memtable_flush_writers.

memtable_cleanup_threshold = 1 / (memtable_flush_writers + 1)

The documentation has been updated as of 3.x, but used to say this:

# memtable_flush_writers defaults to the smaller of (number of disks,
# number of cores), with a minimum of 2 and a maximum of 8.
# 
# If your data directories are backed by SSD, you should increase this
# to the number of cores.
#memtable_flush_writers: 8

...which (I feel) led to many folks setting this value WAY too high.

Assuming a memtable_flush_writers value of 8, the memtable_cleanup_threshold is 0.111. When the footprint of all memtables exceeds this ratio of total memory available, flushing occurs. Too many flush (blocking) writers can prevent this from happening quickly enough. With a single /data dir, I recommend setting memtable_flush_writers to 2.
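To see how the threshold moves with the writer count, here is the formula above as a tiny function. The function name is just for illustration; it is not part of any Cassandra tooling.

```python
def memtable_cleanup_threshold(memtable_flush_writers):
    """Apply the formula 1 / (memtable_flush_writers + 1)."""
    if memtable_flush_writers < 1:
        raise ValueError("memtable_flush_writers must be at least 1")
    return 1.0 / (memtable_flush_writers + 1)


# round(memtable_cleanup_threshold(8), 3)  -> 0.111
# round(memtable_cleanup_threshold(2), 3)  -> 0.333
```

So going from 8 writers down to 2 raises the threshold from about 0.111 to about 0.333.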

Aaron answered Sep 23 '22 13:09


In addition to decreasing the commitlog size as suggested by BryceAtNetwork23, a proper solution to ensure it won't happen again should include monitoring of the disk, so that you are alerted when it's getting full and have time to act or increase the disk size.

Seeing as you are using DataStax, you could set an alert for this in OpsCenter. Haven't used this within the cloud myself, but I imagine it would work. Alerts can be set by clicking Alerts in the top banner -> Manage Alerts -> Add Alert. Configure the mounts to watch and the thresholds to trigger on.

Or, I'm sure there are better tools to monitor disk space out there.
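If you just want something quick while you set up proper monitoring, a small script run from cron could do the basic check. This is only a sketch using Python's standard shutil.disk_usage; the function names, the 80% threshold, and whatever paths you pass in are assumptions you would adjust for your own mounts, and how you deliver the alert (email, pager, etc.) is left out.

```python
import shutil


def disk_usage_percent(path):
    """Return how full the filesystem containing path is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_mounts(paths, warn_percent=80.0):
    """Return a list of (path, percent_used) for mounts at or above warn_percent."""
    alerts = []
    for path in paths:
        pct = disk_usage_percent(path)
        if pct >= warn_percent:
            alerts.append((path, pct))
    return alerts
```

Calling check_mounts with the mount that holds your commitlog directory and acting on a non-empty result would give you the same kind of early warning as the OpsCenter alert.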

Alec Collier answered Sep 25 '22 13:09