
How can I query a Cassandra cluster for its metadata?

We have a process creatively named "bootstrap" that sets up our Cassandra clusters for a given rev of software in an environment (Dev1, Dev2, QA, ..., PROD). This bootstrap creates or updates keyspaces and column families, and populates initial data in non-prod environments.

We are using Astyanax, but we could use Hector for bootstrapping.

Given that another team has decided that each environment will have its own datacenter names, that I want this to work in prod when we go from two datacenters to more, and that we will be using PropertyFileSnitch:

How can I ask the Cassandra cluster for its layout (without shelling out to nodetool ring)?

Specifically, I need to know the names of the datacenters so I can create or update a keyspace with the correct strategy options when using NetworkTopologyStrategy. We want 3 copies per datacenter. Some envs have one datacenter and several have two; eventually production will have more.

Is there CQL or a Thrift call that will give me info about the cluster layout?

I have looked through several TOCs in various doc sets and googled a bit. I thought I would ask here before digging through the nodetool code.

Frobbit asked Oct 07 '22 06:10


1 Answer

I'm not sure how Hector or Astyanax expose this, but the basic Thrift method describeRing(keyspace) should give you what you're looking for. Part of the information it returns is a list of EndpointDetails structs that look like this:

endpoint_details=[EndpointDetails(datacenter='datacenter1', host='127.0.0.1', rack='rack1')]

Along with the rest of the results from that method, you should be able to figure out tokens, DCs, racks, and so on, for each node in the cluster.
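As a rough illustration of the step from endpoint details to keyspace settings, here is a minimal Java sketch. It assumes you have already copied the host, datacenter, and rack values out of whatever your client returns for the ring; the Endpoint class and strategyOptions helper are hypothetical stand-ins, not part of Thrift, Hector, or Astyanax. It collects the distinct datacenter names and builds a strategy-options map with the same replica count for each.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DatacenterOptions {

    /** Hypothetical stand-in for the fields of an EndpointDetails struct. */
    public static final class Endpoint {
        public final String host;
        public final String datacenter;
        public final String rack;

        public Endpoint(String host, String datacenter, String rack) {
            this.host = host;
            this.datacenter = datacenter;
            this.rack = rack;
        }
    }

    /**
     * Builds NetworkTopologyStrategy-style options: one entry per distinct
     * datacenter name, each mapped to the same replica count.
     */
    public static Map<String, String> strategyOptions(List<Endpoint> endpoints, int replicasPerDc) {
        // TreeMap keeps the datacenter names sorted, so the output is stable.
        Map<String, String> options = new TreeMap<>();
        for (Endpoint e : endpoints) {
            options.put(e.datacenter, Integer.toString(replicasPerDc));
        }
        return options;
    }

    public static void main(String[] args) {
        // Made-up sample data: three nodes spread over two datacenters.
        List<Endpoint> ring = new ArrayList<>();
        ring.add(new Endpoint("10.0.0.1", "DC1", "rack1"));
        ring.add(new Endpoint("10.0.0.2", "DC1", "rack2"));
        ring.add(new Endpoint("10.1.0.1", "DC2", "rack1"));

        System.out.println(strategyOptions(ring, 3)); // prints {DC1=3, DC2=3}
    }
}
```

The resulting map can then be passed to whatever keyspace-definition call your client provides; because it is derived from the live ring, it should pick up new datacenters without a code change.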

Since you're using a Java client, you could also use some of the JMX methods (which nodetool uses) to query more specific parts of the cluster. For example, you might look at the snitch mbean ("org.apache.cassandra.db:type=EndpointSnitchInfo"), specifically the getDatacenter(ip) and getRack(ip) methods.
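A minimal sketch of the JMX route, using only the standard javax.management classes. The mbean name and the getDatacenter operation are the ones mentioned above; the SnitchLookup class, its helper names, and the assumption that the operation takes a single String argument and returns a String are mine, and the JMX port is something you would need to confirm for your own nodes.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SnitchLookup {

    static final String SNITCH_MBEAN = "org.apache.cassandra.db:type=EndpointSnitchInfo";

    /** Builds the standard RMI-based JMX service URL for a host and port. */
    public static String jmxUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Invokes getDatacenter(ip) on the snitch mbean over an open connection. */
    public static String datacenterOf(MBeanServerConnection mbeans, String nodeIp) throws Exception {
        // Assumption: the operation takes one String (the node address) and returns a String.
        return (String) mbeans.invoke(
                new ObjectName(SNITCH_MBEAN),
                "getDatacenter",
                new Object[] { nodeIp },
                new String[] { String.class.getName() });
    }

    /** Connects to one node's JMX endpoint and asks which datacenter nodeIp belongs to. */
    public static String datacenterOf(String jmxHost, int jmxPort, String nodeIp) throws Exception {
        JMXServiceURL url = new JMXServiceURL(jmxUrl(jmxHost, jmxPort));
        // try-with-resources closes the connector even if the invoke fails.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            return datacenterOf(connector.getMBeanServerConnection(), nodeIp);
        }
    }
}
```

Calling datacenterOf(jmxHost, jmxPort, nodeIp) for each node address you know about, and collecting the results into a set, would give the list of datacenter names; a getRack lookup would follow the same pattern with a different operation name.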

Tyler Hobbs answered Oct 10 '22 02:10