
Understand Cassandra pooling options (setCoreConnectionsPerHost and setMaxConnectionsPerHost)?

I recently started working with Cassandra and was reading more about connection pooling. I was confused about pool size and couldn't understand what this means:

poolingOptions
    .setCoreConnectionsPerHost(HostDistance.LOCAL,  4)
    .setMaxConnectionsPerHost( HostDistance.LOCAL, 10)
    .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
    .setMaxConnectionsPerHost( HostDistance.REMOTE, 4)
    .setMaxRequestsPerConnection(HostDistance.LOCAL, 2000);

Below is what I want to understand in detail:

  1. What do setCoreConnectionsPerHost, setMaxConnectionsPerHost and setMaxRequestsPerConnection mean?
  2. What do LOCAL and REMOTE mean here?

If someone can explain with an example, it would really help me understand better.

We have a 6-node cluster, all in one DC, with RF 3, and we read and write at LOCAL_QUORUM.

asked Jun 02 '20 03:06 by dragons




1 Answer

The Cassandra native protocol allows multiple queries to be submitted over the same network connection in parallel, without waiting for each answer. setMaxRequestsPerConnection sets how many in-flight queries a single connection may carry simultaneously. The hard upper limit depends on the protocol version (since protocol v3 it is 32k), but in practice you should keep it around 1000-2000: if you need more than that, it's a sign that the server is not keeping up with your queries.
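To make the multiplexing concrete, here is a toy model in plain Java; it is not driver code. Each request on a connection is tagged with a stream id so that responses can be matched to requests even when they arrive out of order. In protocol v3 the stream id is a 16-bit value, which is where the 32k ceiling comes from. The class name StreamIdAllocator and its methods are made up for this sketch.

```java
import java.util.BitSet;

// Toy model of per-connection request multiplexing (hypothetical, not the driver's code).
// Every in-flight request holds one stream id; when none are free, the connection is full.
final class StreamIdAllocator {
    private final BitSet inUse;
    private final int capacity; // plays the role of maxRequestsPerConnection

    StreamIdAllocator(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.inUse = new BitSet(capacity);
    }

    /** Returns the lowest free stream id, or -1 if the connection is saturated. */
    synchronized int acquire() {
        int id = inUse.nextClearBit(0);
        if (id >= capacity) {
            return -1;
        }
        inUse.set(id);
        return id;
    }

    /** Frees a stream id once the response for that request has arrived. */
    synchronized void release(int id) {
        inUse.clear(id);
    }

    /** Number of requests currently in flight on this connection. */
    synchronized int inFlight() {
        return inUse.cardinality();
    }

    public static void main(String[] args) {
        StreamIdAllocator connection = new StreamIdAllocator(2000);
        int first = connection.acquire();
        int second = connection.acquire();
        System.out.println("ids in use: " + first + ", " + second
                + " (in flight: " + connection.inFlight() + ")");
        connection.release(first); // response for the first request came back
        System.out.println("in flight after one response: " + connection.inFlight());
    }
}
```

When acquire() returns -1 in this model, the request has to go to another connection or wait, which is the situation the limit is meant to surface.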

Drivers open connections to every node in the cluster, and these connections are marked either as LOCAL, if they go to nodes in the data center that is local to the application (either set explicitly in the load balancing policy, or inferred from the first contacted point), or as REMOTE, if they go to nodes in other data centers.
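As a rough illustration of that classification, the sketch below labels nodes by comparing their data center name with the application's local one. It is a simplification in plain Java: the real driver delegates this decision to the load balancing policy and also has an IGNORED distance, which is left out here. The addresses and data center names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified LOCAL/REMOTE labelling (hypothetical helper, not the driver's HostDistance logic).
final class DistanceSketch {
    enum Distance { LOCAL, REMOTE }

    /** A node is LOCAL when it lives in the application's local data center. */
    static Distance classify(String hostDc, String localDc) {
        return hostDc.equals(localDc) ? Distance.LOCAL : Distance.REMOTE;
    }

    public static void main(String[] args) {
        String localDc = "dc1"; // assumed name of the application's local data center

        // Invented topology: two nodes in dc1, one in dc2.
        Map<String, String> hostToDc = new LinkedHashMap<>();
        hostToDc.put("10.0.0.1", "dc1");
        hostToDc.put("10.0.0.2", "dc1");
        hostToDc.put("10.1.0.1", "dc2");

        for (Map.Entry<String, String> entry : hostToDc.entrySet()) {
            System.out.println(entry.getKey() + " -> " + classify(entry.getValue(), localDc));
        }
    }
}
```

In a cluster with a single data center, every node ends up LOCAL under this rule.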

The driver can also open several connections to each node. Two values control their number: core, the minimum number of connections, and max, the upper limit. The driver opens new connections when you submit new requests that don't fit within the limit of the existing connections.
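A simplified way to picture how core and max interact is to compute how many connections a given number of in-flight requests would need and clamp that between the two values. This is only a model under that assumption: the driver's actual trigger for adding and trimming connections is more involved and is not reproduced here. The class and method names are made up for the sketch.

```java
// Simplified model of pool sizing between core and max (hypothetical, not the driver's algorithm).
final class PoolGrowthModel {
    /**
     * Connections needed to carry inFlight requests to one node,
     * never fewer than core and never more than max.
     */
    static int connectionsNeeded(int inFlight, int core, int max, int maxRequestsPerConnection) {
        if (inFlight < 0 || core < 0 || max < core || maxRequestsPerConnection <= 0) {
            throw new IllegalArgumentException("invalid pool parameters");
        }
        // Ceiling division: a partially filled connection still counts as one connection.
        int byLoad = (inFlight + maxRequestsPerConnection - 1) / maxRequestsPerConnection;
        return Math.max(core, Math.min(max, byLoad));
    }

    public static void main(String[] args) {
        int core = 4;
        int max = 10;
        int perConnection = 2000;
        for (int inFlight : new int[] {0, 8000, 8001, 20000, 50000}) {
            System.out.println(inFlight + " in-flight requests -> "
                    + connectionsNeeded(inFlight, core, max, perConnection) + " connections");
        }
    }
}
```

In this model the pool stays at core under light load, grows one connection at a time as load exceeds what the current connections can carry, and stops at max no matter how many more requests are submitted.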

So in your example:

poolingOptions
    .setCoreConnectionsPerHost(HostDistance.LOCAL,  4)
    .setMaxConnectionsPerHost( HostDistance.LOCAL, 10)
    .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
    .setMaxConnectionsPerHost( HostDistance.REMOTE, 4)
    .setMaxRequestsPerConnection(HostDistance.LOCAL, 2000);
  • for the local data center, it will open 4 connections per node initially, and may grow to up to 10 connections per node
  • for other data centers, it will open 2 connections per node, which may grow to up to 4 connections per node
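Applied to the cluster in the question (6 nodes, all in one DC), and assuming that single DC is the driver's local DC, every node is LOCAL and the REMOTE settings never come into play. The arithmetic for one client instance can be sketched as follows; PoolSizing is a hypothetical helper written for this example, not part of the driver.

```java
// Back-of-the-envelope numbers for one client against a 6-node, single-DC cluster.
// Hypothetical helper for illustration only.
final class PoolSizing {
    /** Connections opened across all nodes for a given per-host setting. */
    static int totalConnections(int nodes, int connectionsPerHost) {
        return nodes * connectionsPerHost;
    }

    /** Upper bound on in-flight requests from this client to a single node. */
    static long maxInFlightPerHost(int maxConnectionsPerHost, int maxRequestsPerConnection) {
        return (long) maxConnectionsPerHost * maxRequestsPerConnection;
    }

    public static void main(String[] args) {
        int nodes = 6;                       // from the question; all assumed LOCAL
        int coreLocal = 4;                   // setCoreConnectionsPerHost(LOCAL, 4)
        int maxLocal = 10;                   // setMaxConnectionsPerHost(LOCAL, 10)
        int maxRequestsPerConnection = 2000; // setMaxRequestsPerConnection

        // 6 * 4 = 24 connections at startup
        System.out.println("initial connections: " + totalConnections(nodes, coreLocal));
        // 6 * 10 = 60 connections at most
        System.out.println("maximum connections: " + totalConnections(nodes, maxLocal));
        // 10 * 2000 = 20000 in-flight requests per node at most
        System.out.println("max in-flight per node: "
                + maxInFlightPerHost(maxLocal, maxRequestsPerConnection));
    }
}
```

These are per-client numbers, so several application instances multiply the connection count seen by each node.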
answered Sep 23 '22 16:09 by Alex Ott