
Cassandra - client side load balancing

Tags: cassandra

Consider the following Cassandra setup:

  • ring of 6 nodes: A, B, D, E, F, G
  • replication factor: 3
  • partitioner: RandomPartitioner
  • placement strategy: SimpleStrategy

My Test-Column is stored on node B and replicated to nodes D and E.
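
To make the placement above concrete, here is a minimal sketch of how SimpleStrategy-style replica selection works on a token ring. The node tokens are hypothetical (evenly spaced), `token_for_key` only approximates what RandomPartitioner does with an MD5 hash, and the function names are made up for illustration, not Cassandra's API.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # RandomPartitioner tokens fall in the range 0 .. 2**127

# Hypothetical, evenly spaced tokens for the six nodes from the question.
NODES = ["A", "B", "D", "E", "F", "G"]
TOKENS = [i * RING_SIZE // len(NODES) for i in range(len(NODES))]


def token_for_key(key: bytes) -> int:
    """Approximation of RandomPartitioner: absolute value of the MD5 digest
    read as a signed big-endian integer."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


def replicas_for_token(token: int, rf: int = 3) -> list:
    """SimpleStrategy-style placement: the first node whose token is >= the
    key's token holds the first copy, and the next rf - 1 nodes clockwise on
    the ring hold the other copies."""
    start = bisect.bisect_left(TOKENS, token) % len(NODES)  # wrap past the last node
    return [NODES[(start + i) % len(NODES)] for i in range(rf)]


# A token that falls in B's range is stored on B and replicated to D and E.
print(replicas_for_token(TOKENS[1]))
# Where a real row key would land under the hypothetical tokens above.
print(replicas_for_token(token_for_key(b"Test-Column")))
```

With a replication factor of 3, any token that falls in B's range ends up on B, D and E, which matches the situation described in the question.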

Now I have multiple Java processes reading my Test-Column through the Hector API (Thrift) with read consistency level CL.ONE.

There are two possibilities:

  1. Hector will forward all calls to node B, because B is the data master.
  2. Hector will load balance read calls across nodes B, D and E (the master and its replicas). In this case my test column would be loaded into the cache on each of these Cassandra instances.

Which one is it, 1) or 2)?

Thanks and regards, Maciej

Maciej Miklas asked Nov 16 '11 13:11


1 Answer

I believe it is neither, but rather: 3) Cassandra forwards all calls to the closest replica that is alive, where "closeness" is determined by the snitch currently in use (set in cassandra.yaml).

  • SimpleSnitch chooses the closest node on the token ring.
  • AbstractNetworkTopologySnitch and derived snitches first try to choose nodes in the same rack, then nodes in the same datacenter.

If DynamicSnitch is enabled, it dynamically adjusts the node closeness returned by the underlying snitch, according to the nodes' recent performance.
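
The ordering described above can be sketched roughly as follows. This is a simplified illustration, not Cassandra's actual snitch code: the `Node` class, the datacenter and rack names, the `scores` dictionary (lower means better recent performance) and both function names are assumptions made up for this example, and the real DynamicSnitch scoring is more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    datacenter: str
    rack: str


def topology_rank(coordinator: Node, replica: Node) -> int:
    """Lower rank means closer: same rack, then same datacenter, then the rest."""
    if replica.datacenter == coordinator.datacenter:
        return 0 if replica.rack == coordinator.rack else 1
    return 2


def sort_by_proximity(coordinator, replicas, scores=None):
    """Order replicas by topology; if hypothetical latency scores are given,
    re-order by score. Python's sort is stable, so replicas with equal
    scores keep their topology order."""
    ordered = sorted(replicas, key=lambda r: topology_rank(coordinator, r))
    if scores is not None:
        ordered = sorted(ordered, key=lambda r: scores.get(r.name, 0.0))
    return ordered


# Hypothetical topology: node A coordinates a read for data held on B, D and E.
coordinator = Node("A", "dc1", "r2")
replicas = [Node("B", "dc1", "r1"), Node("D", "dc1", "r2"), Node("E", "dc2", "r1")]

static_order = [n.name for n in sort_by_proximity(coordinator, replicas)]
dynamic_order = [n.name for n in sort_by_proximity(
    coordinator, replicas, scores={"B": 1.0, "D": 5.0, "E": 1.0})]
print(static_order, dynamic_order)
```

With read CL.ONE, the first live entry in the sorted list is the replica that would serve the read, so a slow node can drop down the list when dynamic scores are taken into account.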

See Cassandra ArchitectureInternals under "Read Path" for more information.

Theodore Hong answered Sep 28 '22 05:09