
Cassandra: bigger replication factor = faster reads?

Tags:

cassandra

Does increasing replication factor on a cluster also increase the read speed?

I understand that with a replication factor of 1, 6 nodes, and evenly distributed tokens, there is only a 16.67% chance that a given node holds the requested data. If it does not, it has to ask the node responsible, and that takes extra time.

I guess that with the replication factor set to 6, each node has the full dataset and can fetch data immediately without asking other nodes (we're using a read consistency level of ONE). So increasing the replication factor should increase read speed. Is this correct?
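The arithmetic behind that reasoning can be sketched as follows. This is only a rough model, assuming evenly distributed tokens, replicas placed on distinct nodes, and a client that picks its coordinator node uniformly at random; the function name `local_replica_probability` is made up for illustration.

```python
from fractions import Fraction


def local_replica_probability(replication_factor: int, node_count: int) -> Fraction:
    """Chance that a randomly chosen coordinator already holds a replica of the row.

    Assumes each row lives on `replication_factor` distinct nodes out of
    `node_count`, and that every node is equally likely to receive the request.
    """
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication factor must be between 1 and the node count")
    return Fraction(replication_factor, node_count)


if __name__ == "__main__":
    # The 6-node cluster from the question, at a few replication factors.
    for rf in (1, 2, 3, 6):
        p = local_replica_probability(rf, 6)
        print(f"RF={rf}: {float(p):.2%} chance the coordinator has the data locally")
```

Under these assumptions the chance grows linearly with the replication factor and reaches 100% only when it equals the node count. The model says nothing about disk, cache, or network effects, which also influence read latency.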

Our app has relatively few writes but more than 10k get() operations per second. We have 6 nodes in the cluster and we need all read operations to be extremely fast, which is why we're looking for a way to improve Cassandra's read performance.

PawelRoman asked Oct 10 '12 16:10



2 Answers

That's correct, as long as you're using ConsistencyLevel.ONE.

jbellis answered Sep 29 '22 12:09



I actually ran the YCSB benchmarks (100% write and 100% read) to test this. Increasing the replication factor seems to cause slower reads while the consistency level is kept at ONE.

In an 8 node cluster here are the numbers I am getting:

16 million read operations, YCSB workload C:

Replication factor    Read time (min)
1                     10.8840833333333
2                     11.1243666666667
4                     17.4050333333333
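To put those numbers in perspective, here is a small sketch that converts the reported times into throughput and slowdown relative to a replication factor of 1. It assumes each reported time is the total wall-clock time for all 16 million reads; the helper names are made up for illustration.

```python
READ_OPERATIONS = 16_000_000

# Replication factor -> total read time in minutes, as reported above.
READ_TIME_MINUTES = {
    1: 10.8840833333333,
    2: 11.1243666666667,
    4: 17.4050333333333,
}


def throughput_ops_per_sec(minutes: float) -> float:
    """Average reads per second, assuming `minutes` covers all READ_OPERATIONS."""
    return READ_OPERATIONS / (minutes * 60)


def slowdown_vs_rf1(replication_factor: int) -> float:
    """Ratio of the read time at this replication factor to the time at RF=1."""
    return READ_TIME_MINUTES[replication_factor] / READ_TIME_MINUTES[1]


if __name__ == "__main__":
    for rf, minutes in READ_TIME_MINUTES.items():
        print(
            f"RF={rf}: {throughput_ops_per_sec(minutes):,.0f} reads/s, "
            f"{slowdown_vs_rf1(rf):.2f}x the RF=1 time"
        )
```

By this calculation, going from a replication factor of 1 to 2 costs about 2% in read time, while going to 4 costs roughly 60%.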

For greater sizes the jump is even bigger.

Can anyone explain why?

elif answered Sep 29 '22 12:09
