I was listening to this talk on data modelling in Cassandra. The speaker makes the general statement that 'writes are faster than reads in Cassandra'.
Is this always true? If so, why?
Cassandra is very performant on reads when compared to other storage systems, even for read-heavy workloads. As in any database, reads are best when the hot working set fits into memory.
How is data written? When a write occurs, Cassandra stores the data in an in-memory structure called the memtable and, to provide configurable durability, also appends the write to the commit log on disk. The commit log receives every write made to a Cassandra node, and these durable writes survive even if power fails on a node.
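To make the two steps concrete, here is a minimal toy sketch of that write path in Python. It is not Cassandra code: `ToyNode`, the JSON-lines log format, and the `demo` helper are all made up for illustration, and calling `fsync` on every write is a simplification (how often the real commit log is synced is configurable).

```python
import json
import os
import tempfile


class ToyNode:
    """Hypothetical toy model: commit log append plus memtable update."""

    def __init__(self, commit_log_path):
        self.commit_log_path = commit_log_path
        self.memtable = {}  # in-memory structure, lost if the process dies

    def write(self, key, value):
        # 1. Append the mutation to the commit log on disk (sequential I/O).
        with open(self.commit_log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())  # simplification: sync on every write
        # 2. Apply the same mutation to the memtable.
        self.memtable[key] = value

    def replay(self):
        # After a restart, rebuild the memtable from the commit log.
        self.memtable = {}
        if not os.path.exists(self.commit_log_path):
            return
        with open(self.commit_log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.memtable[record["key"]] = record["value"]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commitlog.jsonl")
        node = ToyNode(path)
        node.write("user:1", "alice")
        node.write("user:2", "bob")
        node.write("user:1", "alice-v2")
        # Simulate a power failure: memory is gone, the log file remains.
        restarted = ToyNode(path)
        restarted.replay()
        return restarted.memtable
```

Note that neither step needs to look up existing data first: the log is only appended to and the memtable is updated in memory, which is the core reason the write path is cheap.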
For write operations, the write consistency level specifies how many replicas must respond to a write request before the write is considered successful. Even at low consistency levels, Cassandra writes to all replicas of the partition key, including replicas in other datacenters.
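As a rough illustration of the difference between "how many replicas are written to" and "how many must respond", here is a small Python sketch. The function names and the list-of-booleans model of replicas are assumptions made for this example; only a few consistency levels are covered, and datacenter-aware levels are left out.

```python
def required_acks(consistency_level, replication_factor):
    """Number of replica acknowledgements needed for a write to succeed."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "TWO":
        return 2
    if level == "THREE":
        return 3
    if level == "QUORUM":
        # A quorum is a majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def coordinate_write(replica_up, consistency_level):
    """Toy coordinator. replica_up holds one boolean per replica (True = responds)."""
    replication_factor = len(replica_up)
    needed = required_acks(consistency_level, replication_factor)
    # The write is sent to every replica, whatever the consistency level.
    sent = replication_factor
    acks = sum(1 for up in replica_up if up)
    return {"sent": sent, "acks": acks, "needed": needed, "success": acks >= needed}
```

For example, with three replicas of which only one responds, `coordinate_write([True, False, False], "ONE")` succeeds while the same call with `"QUORUM"` does not, yet in both cases the write was sent to all three.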
That is still true, although the difference is not as big as it was in the past. A write generally performs better because it involves little I/O: a write operation is complete once the data has been written to both the commit log (a file) and memory (the memtable). When the memtable reaches its maximum size, the whole memtable is flushed to an SSTable on disk. A read, by contrast, may require more I/O for several reasons. A read operation first consults a bloom filter (a filter associated with each SSTable that can save I/O time by reporting that a piece of data is definitely not present in that SSTable) and then, if the filter returns a positive result, Cassandra starts seeking in the SSTable to look for the data. HTH, Carlo
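The flush and the bloom-filter check can be sketched with a toy model in Python. Everything here (`BloomFilter`, `ToySSTable`, `ToyTable`, the tiny `memtable_limit`) is hypothetical and heavily simplified: the commit log is left out, and the real read path also involves caches, indexes, and merging results from several SSTables.

```python
import hashlib


class BloomFilter:
    """Small bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ToySSTable:
    """Immutable 'on-disk' table; disk_reads counts simulated seeks."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)
        self.disk_reads = 0

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None  # definitely absent: no disk access needed
        self.disk_reads += 1  # positive answer: we have to look on disk
        return self.rows.get(key)


class ToyTable:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Memtable is full: flush all of it to a new SSTable.
            self.sstables.append(ToySSTable(self.memtable))
            self.memtable = {}

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest first
            value = sstable.get(key)
            if value is not None:
                return value
        return None
```

In this model a write touches only the memtable (plus an occasional flush), whereas a read for an older key may have to ask several SSTables; the bloom filter is what keeps most of those checks from turning into disk seeks.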