In "Cassandra The Definitive Guide" (2nd edition) by Jeff Carpenter & Eben Hewitt, the following formula is used to calculate the size of a table on disk (apologies for the blurred part):
There are two things I don't understand in this equation.
First: why does the size of the clustering columns get counted once for every regular column? Shouldn't we multiply it by the number of rows instead? It seems to me that by calculating it this way, we're saying that the data in each clustering column gets replicated for each regular column, which I suppose is not the case.
Second: why don't the primary key columns get multiplied by the number of partitions? From my understanding, if we have a node with two partitions, then we should multiply the size of the primary key columns by two, because we'll have two different primary keys on that node.
If you need information about a table or tables, you can use the nodetool cfstats command. If you provide only the name of a keyspace, it will report stats for all the tables in that keyspace.
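For example (my_keyspace and my_table are placeholder names):
nodetool cfstats my_keyspace            # stats for every table in the keyspace
nodetool cfstats my_keyspace.my_table   # stats for a single table
In Cassandra 3.x the same output is also available under the newer name nodetool tablestats.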
Partition size is measured by the number of cells (values) that are stored in the partition. Cassandra's hard limit is 2 billion cells per partition, but you'll likely run into performance issues before reaching that limit.
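As a rough sketch, the cell count of a partition can be estimated from the schema (the variable names below are descriptive placeholders, and this ignores per-cell metadata):
N_cells = N_rows * (N_columns - N_primary_key_columns - N_static_columns) + N_static_columns
Static columns are stored once per partition rather than once per row, which is why they are subtracted from the per-row term and added back once.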
As mentioned earlier, a table was also called a column family in earlier versions of Cassandra, and it is still referred to as a column family in some of Cassandra's error messages and documentation. It is important to define a primary key for a table.
As the author, I greatly appreciate the question and your engagement with the material!
With respect to the original questions - remember that this is not the formula to calculate the size of the table, it is the formula to calculate the size of a single partition. The intent is to use this formula with "worst case" number of rows to identify overly large partitions. You'd need to multiply the result of this equation by the number of partitions to get an estimate of total data size for the table. And of course this does not take replication into account.
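In other words, as a back-of-the-envelope estimate:
Estimated Table Size = Worst-Case Partition Size * Number of Partitions
and the on-disk footprint across the cluster would additionally be multiplied by the replication factor.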
Also thanks to those who responded to the original question. Based on your feedback I spent some time looking at the new (3.0) storage format to see whether that might impact the formula. I agree that Aaron Morton's article is a helpful resource (link provided above).
The basic approach of the formula remains sound for the 3.0 storage format. The way the formula works, you're basically adding:
- the sizes of the partition key and static columns, counted once per partition
- the size of each row (its clustering column values plus its regular column values), multiplied by the number of rows
- the metadata overhead for each cell
Updating the formula for the 3.0 storage format requires revisiting the constants. For example, the original equation assumes 8 bytes of metadata per cell to store a timestamp. The new format treats the timestamp on a cell as optional since it can be applied at the row level. For this reason, there is now a variable amount of metadata per cell, which could be as low as 1-2 bytes, depending on the data type.
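A rough before-and-after sketch of a single cell (illustrative only, not the exact serialization format):
pre-3.0: cell = name + value + timestamp (8 bytes, always present)
3.0+:    cell = flags (1 byte) + value [+ optional timestamp, encoded as a small delta against the row-level timestamp]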
After reading this feedback and rereading that section of the chapter, I plan to update the text to add some clarifications as well as stronger caveats about this formula being useful as an approximation rather than an exact value. There are factors it doesn't account for at all such as writes being spread over multiple SSTables, as well as tombstones. We're actually planning another printing this spring (2017) to correct a few errata, so look for those changes soon.
It's because of the internal storage structure used by Cassandra versions earlier than 3.0.
Let's take an example:
CREATE TABLE my_table (
    pk1 int,
    pk2 int,
    ck1 int,
    ck2 int,
    d1 int,
    d2 int,
    s int STATIC,
    PRIMARY KEY ((pk1, pk2), ck1, ck2)
);
Insert some dummy data:
 pk1 | pk2 | ck1 | ck2  | s     | d1     | d2
-----+-----+-----+------+-------+--------+---------
   1 |  10 | 100 | 1000 | 10000 | 100000 | 1000000
   1 |  10 | 100 | 1001 | 10000 | 100001 | 1000001
   2 |  20 | 200 | 2000 | 20000 | 200000 | 2000000
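For reference, rows like these could be produced with the following INSERT statements (values taken from the table above):
INSERT INTO my_table (pk1, pk2, ck1, ck2, s, d1, d2) VALUES (1, 10, 100, 1000, 10000, 100000, 1000000);
INSERT INTO my_table (pk1, pk2, ck1, ck2, s, d1, d2) VALUES (1, 10, 100, 1001, 10000, 100001, 1000001);
INSERT INTO my_table (pk1, pk2, ck1, ck2, s, d1, d2) VALUES (2, 20, 200, 2000, 20000, 200000, 2000000);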
The internal structure will be:
      | :s    | 100:1000: | 100:1000:d1 | 100:1000:d2 | 100:1001: | 100:1001:d1 | 100:1001:d2 |
 -----+-------+-----------+-------------+-------------+-----------+-------------+-------------+
 1:10 | 10000 |           | 100000      | 1000000     |           | 100001      | 1000001     |

      | :s    | 200:2000: | 200:2000:d1 | 200:2000:d2 |
 -----+-------+-----------+-------------+-------------+
 2:20 | 20000 |           | 200000      | 2000000     |
Each partition is one wide row keyed by pk1:pk2. The static column s is stored once per partition (shown here as :s), the empty-valued cells such as 100:1000: are CQL row markers, and every regular-column cell name carries the clustering values as a prefix.
So the size of a single partition will be (using 4 bytes per int; note that the clustering values are repeated in the name of every regular-column cell):
Single Partition Size = (pk1 + pk2 + ck1 + ck2) + s + 2 rows * ((d1 + (ck1 + ck2)) + (d2 + (ck1 + ck2)))
                      = (4 + 4 + 4 + 4) + 4 + 2 * ((4 + (4 + 4)) + (4 + (4 + 4))) bytes
                      = 68 bytes
Estimated Table Size = Single Partition Size * Number of Partitions
                     = 68 * 2 bytes
                     = 136 bytes
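Generalizing the arithmetic above (one reading of the book's formula, reconstructed from this example, and ignoring per-cell metadata such as timestamps):
Partition Size = sum(partition key columns) + sum(clustering columns) + sum(static columns)
               + N_rows * sum over regular columns of (column size + sum(clustering columns))
This is why the clustering column sizes appear once per regular column (they are repeated in every cell name), and why the partition key is not multiplied by the number of partitions: the formula sizes a single partition, and you multiply by the partition count afterwards.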
More: http://opensourceconnections.com/blog/2013/07/24/understanding-how-cql3-maps-to-cassandras-internal-data-structure/