
Cassandra commit log clarification

I have read several documents about the Cassandra commit log and, to me, they give conflicting information about this structure (or structures). The diagram shows that when a write occurs, Cassandra writes to the memtable and the commit log. The confusing part is where this commit log resides.

The diagram that I've seen over and over shows the commit log on disk. However, if you read further, the documentation also talks about a commit log buffer in memory, and says that this piece of memory is flushed to disk every 10 seconds.

DataStax Documentation states: "When a write occurs, Cassandra stores the data in a memory structure called memtable, and to provide configurable durability, it also appends writes to the commit log buffer in memory. This buffer is flushed to disk every 10 seconds".

Nowhere in their diagram do they show a memory structure called a commit log buffer. They only show the commit log residing on disk.

It also states: "When a write occurs, Cassandra stores the data in a structure in memory, the memtable, and also appends writes to the commit log on disk."

So I'm confused by the above. Is it written to the commit log memory buffer, which is eventually flushed to disk (which I would assume is also called the "commit log"), or is it written to the memtable and commit log on disk?

Apache's documentation states this: "Instead, like other modern systems, Cassandra provides durability by appending writes to a commitlog first. This means that only the commitlog needs to be fsync'd, which, if the commitlog is on its own volume, obviates the need for seeking since the commitlog is append-only. Implementation details are in ArchitectureCommitLog.

Cassandra's default configuration sets the commitlog_sync mode to periodic, causing the commitlog to be synced every commitlog_sync_period_in_ms milliseconds, so you can potentially lose up to that much data if all replicas crash within that window of time."

What I infer from the Apache statement is that you can lose data ONLY because of the asynchronous nature of writes (a write is acknowledged once it reaches a cache). It even states that you can lose data if all replicas crash before the commit log is flushed/sync'd.

I'm not sure what I can infer from the DataStax documentation and diagram, as they make two different statements about the commit log: one places it in memory, the other on disk.

Can anyone clarify what I consider a poorly worded and conflicting set of documentation?

I'll assume there is a commit log buffer, as both sources reference it (yet DataStax doesn't show it in the diagram). How and when this buffer is managed is, I think, the key thing to understand.

Jim Wartnick asked Jul 21 '16 14:07





1 Answer

When the write path is explained at a high level, the commit log is described as a file, and that is accurate: the commit log is the on-disk storage mechanism that provides durability. The confusion starts one level deeper, when the buffer cache and the need to issue fsyncs come into the picture. The reference to a "commit log buffer in memory" is talking about the OS buffer cache, not a memory structure in Cassandra. You can see in the code that there is no separate in-memory structure for the commit log; instead, the mutation is serialized and written to a file-backed buffer.
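The gap between "written to the file" and "synced to disk" is not specific to Cassandra, so it can be illustrated with a few lines of Python. This is only a sketch and not Cassandra code: the file name, the length-prefixed record layout, and the function names `append_record`, `read_records`, and `demo` are all made up for illustration.

```python
import os
import tempfile


def append_record(f, payload: bytes, sync: bool) -> None:
    """Append one length-prefixed record; optionally force it to stable storage."""
    f.write(len(payload).to_bytes(4, "big") + payload)
    # flush() hands the bytes from Python's own buffer to the OS.
    # At this point they typically sit in the OS buffer cache, not on the device.
    f.flush()
    if sync:
        # fsync() asks the OS to push its cached data for this file to the device.
        os.fsync(f.fileno())


def read_records(path):
    """Read back every complete record from the log file."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            size = int.from_bytes(header, "big")
            records.append(f.read(size))
    return records


def demo():
    # Hypothetical log file in a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "commitlog.bin")
    with open(path, "ab") as f:
        append_record(f, b"mutation-1", sync=False)  # visible in the file, not yet synced
        append_record(f, b"mutation-2", sync=True)   # synced; also covers mutation-1
    return read_records(path)
```

Both records are readable from the file either way. What the `sync` flag changes is whether they would survive a power loss or kernel crash at that moment, which is exactly the window the documentation is describing.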

Cassandra comes with two strategies for managing fsync on the commit log.

commitlog_sync 
    (Default: periodic) The method that Cassandra uses to acknowledge writes:
    periodic: (Default: 10000 milliseconds [10 seconds])
    Used with commitlog_sync_period_in_ms to control how often the commit log is synchronized to disk. In periodic mode, writes are acknowledged immediately.

    batch: (Default: disabled)
    Used with commitlog_sync_batch_window_in_ms (Default: 2 ms) to control how long Cassandra waits for other writes before performing a sync. When using this method, writes are not acknowledged until fsynced to disk.

The periodic setting offers better performance at the cost of a small increase in the chance that data can be lost. The batch setting guarantees durability at the cost of latency.
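To make the trade-off concrete, here is a toy model of the two modes in Python. It is a deliberately simplified sketch, not how Cassandra implements them: the class `ToyCommitLog` and the helper `at_risk_after` are hypothetical, time is passed in explicitly instead of using a background sync thread, and the "batch" mode here fsyncs on every write rather than grouping writes within a window.

```python
import os
import tempfile


class ToyCommitLog:
    """Append-only log that tracks how many acknowledged writes are not yet fsynced."""

    def __init__(self, path, mode, sync_period_ms=10000):
        if mode not in ("periodic", "batch"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.sync_period_ms = sync_period_ms
        self.file = open(path, "ab")
        self.last_sync_ms = 0
        self.unsynced = 0      # acknowledged writes that a crash could still lose
        self.fsync_count = 0

    def _sync(self, now_ms):
        self.file.flush()
        os.fsync(self.file.fileno())
        self.fsync_count += 1
        self.unsynced = 0
        self.last_sync_ms = now_ms

    def append(self, payload, now_ms):
        """Write one record, then return the number of acknowledged writes at risk."""
        self.file.write(payload + b"\n")
        self.unsynced += 1
        if self.mode == "batch":
            # Simplification: sync before acknowledging every single write.
            self._sync(now_ms)
        elif now_ms - self.last_sync_ms >= self.sync_period_ms:
            # Periodic: sync only once the period has elapsed.
            self._sync(now_ms)
        return self.unsynced

    def close(self):
        self.file.close()


def at_risk_after(mode, timestamps_ms):
    """Replay writes at the given simulated times and report the at-risk count after each."""
    path = os.path.join(tempfile.mkdtemp(), "toy-commitlog.log")
    log = ToyCommitLog(path, mode)
    try:
        return [log.append(b"m", t) for t in timestamps_ms]
    finally:
        log.close()
```

With writes at simulated times 0, 4000, 9000, 10000, and 12000 ms, `at_risk_after("periodic", ...)` returns `[1, 2, 3, 0, 1]`, while `at_risk_after("batch", ...)` returns `[0, 0, 0, 0, 0]`. In this model the periodic mode leaves up to one sync period of acknowledged writes exposed, and the batch mode leaves none but pays for an fsync before each acknowledgement.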

Andrew Weaver answered Sep 24 '22 09:09
