
Cassandra: how to get total table size / estimate row count

Intro

I'm trying to gather some stats from a Cassandra 1.2.6 cluster to implement a web service to provide those stats to a web app. I'm accessing the cluster from Python using the cql library, but I can ssh or pssh to the nodes as well.

The problem

My problem is how to get the total table size (i.e. the actual disk usage of each table) in the entire cluster, and if possible the total row count of each table (this can be an estimate).

The question

So far the only option I've found is running nodetool cfstats on each node and parsing the response. Is there a better way of doing this?

Thanks in advance!

Sergio Ayestarán asked Oct 08 '13 18:10



1 Answer

I think the best way to do this would be to access the statistics directly through JMX (which is how nodetool actually works). Each node provides a wide range of metrics, but the ones you would be interested in are:

org.apache.cassandra.metrics
  ColumnFamily
    cf_name
       TotalDiskSpaceUsed
       MemtableDataSize
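
Since each node only reports its own numbers, a cluster-wide figure means reading the same metric on every node and summing the results. Here is a minimal Python sketch of that aggregation. It assumes the MBean object name follows the pattern `org.apache.cassandra.metrics:type=ColumnFamily,keyspace=<ks>,scope=<cf>,name=<metric>`, which you should confirm with jconsole against your own version. The `read_metric` callable is hypothetical: it stands in for whatever JMX bridge you use from Python, since the standard library has no JMX client.

```python
from typing import Callable, Iterable

METRICS_DOMAIN = "org.apache.cassandra.metrics"


def metric_object_name(keyspace: str, table: str, metric: str) -> str:
    """Build the JMX object name for a per-table metric.

    Assumption: the keyspace/scope/name key layout matches your
    Cassandra version; check it in jconsole before relying on it.
    """
    return (
        f"{METRICS_DOMAIN}:type=ColumnFamily,"
        f"keyspace={keyspace},scope={table},name={metric}"
    )


def cluster_total(
    nodes: Iterable[str],
    keyspace: str,
    table: str,
    metric: str,
    read_metric: Callable[[str, str], int],
) -> int:
    """Sum one per-table metric over every node in the cluster.

    read_metric(node, object_name) is a hypothetical caller-supplied
    function that returns the metric value reported by a single node.
    """
    name = metric_object_name(keyspace, table, metric)
    return sum(int(read_metric(node, name)) for node in nodes)


if __name__ == "__main__":
    # Stubbed per-node values standing in for real JMX reads.
    fake_values = {"10.0.0.1": 1500, "10.0.0.2": 2500}

    def fake_reader(node: str, object_name: str) -> int:
        return fake_values[node]

    total = cluster_total(
        fake_values, "my_keyspace", "my_table", "TotalDiskSpaceUsed", fake_reader
    )
    print(total)
```

Summing TotalDiskSpaceUsed this way counts every replica, which matches the actual disk usage asked about in the question. It is not the size of the unique data.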
RussS answered Sep 21 '22 22:09
