Is there a way to get a row count (key count) of a single column family in Cassandra? get_count can only be used to get the column count.
For instance, if I have a column family containing users and wanted to get the number of users. How could I do it? Each user is it's own row.
A SELECT expression using COUNT(*) returns the number of rows that matched the query. Alternatively, you can use COUNT(1) to get the same result.
Counting with Cassandra Base Cassandra, without any of the extra DSE-added features, can already get counts in a few ways. Using CQL, Cassandra's query language, the syntax for a standard count is “SELECT COUNT(*) FROM keyspace. table;”.
If you are working on a large data set and are okay with a pretty good approximation, I highly recommend using the command:
nodetool --host <hostname> cfstats
This will dump out a list for each column family looking like this:
Column Family: widgets SSTable count: 11 Space used (live): 4295810363 Space used (total): 4295810363 Number of Keys (estimate): 9709824 Memtable Columns Count: 99008 Memtable Data Size: 150297312 Memtable Switch Count: 434 Read Count: 9716802 Read Latency: 0.036 ms. Write Count: 9716806 Write Latency: 0.024 ms. Pending Tasks: 0 Bloom Filter False Postives: 10428 Bloom Filter False Ratio: 1.00000 Bloom Filter Space Used: 18216448 Compacted row minimum size: 771 Compacted row maximum size: 263210 Compacted row mean size: 1634
The "Number of Keys (estimate)" row is a good guess across the cluster and the performance is a lot faster than explicit count approaches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With