
Token-aware Astyanax connection pool does not distribute connections evenly across nodes

I am using an Astyanax connection pool configured like this:

ipSeeds = "LOAD_BALANCER_HOST:9160";
conPool.setSeeds(ipSeeds)
.setDiscoveryType(NodeDiscoveryType.TOKEN_AWARE)
.setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE);

My cluster has 4 nodes, and 8 client machines connect to it. LOAD_BALANCER_HOST forwards requests to one of the four nodes.

On one of the client machines, I see:

$ netstat -an | grep 9160 | awk '{print $5}' | sort | uniq -c
    235 node1:9160
    680 node2:9160
      4 node3:9160
      4 node4:9160

So although the ConnectionPoolType is TOKEN_AWARE, my client connects mainly to node2, sometimes to node1, and almost never to node3 or node4.
The question is: why is this happening? Shouldn't a token-aware connection pool query the ring for the node list and connect to all active nodes using a round-robin algorithm?

mvallebr asked Nov 01 '22 08:11


1 Answer

William Price is right: using a TokenAwarePolicy, possibly together with a default partitioner, means two things. First, your data may be stored unevenly across your nodes. Second, when querying, the load balancing policy makes your driver send each request to the node that owns the requested data, so the connections follow the same uneven distribution.
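To make that effect concrete, here is a toy sketch in plain Java. It is not Astyanax or driver code: the class `TokenAwareSketch`, the 0..99 token space, and the token ranges per node are all made up for illustration. It only shows that when requests are routed to the owner of each token, uneven ownership produces uneven request counts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: node names and token values are assumptions, not real cluster data.
final class TokenAwareSketch {
    // Ring: maps the highest token a node owns to that node's name.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int upperToken, String node) {
        ring.put(upperToken, node);
    }

    // Token-aware routing: the owner is the node with the first upper token >= the
    // requested token; tokens beyond the last entry wrap around to the first node.
    String ownerOf(int token) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Count how many requests each node would receive for the given tokens.
    Map<String, Integer> requestsPerNode(int[] tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String node : ring.values()) {
            counts.put(node, 0);
        }
        for (int token : tokens) {
            counts.merge(ownerOf(token), 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical skewed ring on a 0..99 token space: node2 owns most of it.
    static TokenAwareSketch skewedRing() {
        TokenAwareSketch pool = new TokenAwareSketch();
        pool.addNode(29, "node1"); // tokens 0..29
        pool.addNode(97, "node2"); // tokens 30..97
        pool.addNode(98, "node3"); // token 98
        pool.addNode(99, "node4"); // token 99
        return pool;
    }

    public static void main(String[] args) {
        int[] tokens = new int[100];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = i; // one request per token, i.e. perfectly uniform keys
        }
        System.out.println(skewedRing().requestsPerNode(tokens));
        // prints {node1=30, node2=68, node3=1, node4=1}
    }
}
```

Even with perfectly uniform keys, the node that owns the largest range receives most of the requests, which is the same shape as the netstat output in the question.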

You can improve your cluster's performance by using a different, or possibly a custom, partitioner that distributes your data evenly. To spread queries across nodes regardless of which node owns the data, use either

  • RoundRobinPolicy (http://www.datastax.com/doc-source/developer/java-apidocs/com/datastax/driver/core/policies/RoundRobinPolicy.html) or
  • DCAwareRoundRobinPolicy (http://www.datastax.com/doc-source/developer/java-apidocs/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html).

The latter, of course, needs the definition of data centers in your keyspace.
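For comparison, this is roughly what round-robin selection does, sketched in plain Java. The class `RoundRobinSketch` and its method names are hypothetical and are not part of Astyanax or the DataStax driver; the real policies additionally track node state. As far as I know, Astyanax offers a comparable setting as `ConnectionPoolType.ROUND_ROBIN`, but check that against your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a hypothetical host selector, not library code.
final class RoundRobinSketch {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts); // defensive copy
    }

    // Hand out hosts in rotation, ignoring which node owns the requested data.
    // floorMod keeps the index non-negative even after the counter overflows.
    String nextHost() {
        int index = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch selector =
                new RoundRobinSketch(Arrays.asList("node1", "node2", "node3", "node4"));
        for (int i = 0; i < 8; i++) {
            System.out.println(selector.nextHost());
        }
    }
}
```

Every node gets the same share of requests, but a request may land on a node that does not own the data, so that node has to forward it to the owner as coordinator.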

Without any further information, I would suggest just changing the partitioner, since a token-aware load balancing policy is usually a good idea. The main load will end up on the nodes that own the data either way. The token-aware policy simply gets you to the right coordinator more quickly.

Daniel Schulz answered Nov 15 '22 12:11