After running tpstats on all nodes, I see that many of them have a high ALL TIME BLOCKED count for NTR. We have a 4-node cluster, and the NTR ALL TIME BLOCKED values are:
NODE 1: 23953
NODE 2: 2935
NODE 3: 15229
NODE 4: 5951
I know ALL TIME BLOCKED is bad, so I am worried about what I am doing wrong.
Native transport requests (NTR) are any requests made via the CQL Native Protocol. CQL Native Protocol is the way the Cassandra driver communicates with the server. This includes all reads, writes, schema changes, etc. There are a limited number of threads available to process incoming requests.
The native transport is the CQL Native Protocol (as opposed to the Thrift protocol) and is the way all modern Cassandra drivers communicate with the server. This includes all reads, writes, schema changes, and so on. A blocked request is one that is waiting for something else to complete before it can run.
This pool handles CQL requests, so its size is the number of active CQL requests allowed. It is limited to prevent too many active requests from OOMing your system (for example, each one returning large blobs). This effectively applies backpressure to your client application, slowing it down. Unfortunately, if you have small requests, this is not ideal and hurts your throughput, so CASSANDRA-11363 added a setting that lets you make the space tradeoff for small, bursty workloads.
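To make the backpressure idea concrete, here is a toy model in Python. It is only a sketch, not Cassandra's actual executor: the function name `simulate`, the tick-based timing, and the numbers 128 and 4096 are illustrative assumptions. A request counts as blocked when it arrives and there is neither a free worker nor room in the queue.

```python
def simulate(bursts, workers, queue_limit):
    """Toy model (not Cassandra code): count arrivals that find no free slot.

    bursts      -- number of requests arriving at each tick
    workers     -- requests the pool can complete per tick (hypothetical)
    queue_limit -- how many requests may wait in the queue (hypothetical)
    """
    waiting = 0   # requests carried over from earlier ticks
    blocked = 0   # cumulative count, analogous to "all time blocked"
    for arriving in bursts:
        # Slots available to new arrivals: worker slots plus queue slots,
        # minus whatever is still left over from previous ticks.
        room = max(0, workers + queue_limit - waiting)
        blocked += max(0, arriving - room)
        # Blocked requests are not dropped; they still have to be served.
        backlog = waiting + arriving
        waiting = backlog - min(backlog, workers)
    return blocked


bursty = [300, 0, 0, 300]      # 600 requests in two bursts
smooth = [150, 150, 150, 150]  # the same 600 requests, spread out

print(simulate(bursty, workers=128, queue_limit=128))
print(simulate(smooth, workers=128, queue_limit=128))
print(simulate(bursty, workers=128, queue_limit=4096))
```

In this model the bursty load accumulates blocked requests with the small queue, while the same total load spread evenly does not, and neither does the bursty load once the queue is large enough to absorb the bursts. That is the tradeoff described above: a larger queue costs memory but stops short bursts from stalling the client.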
If you upgrade to 2.2.8+, you can set the maximum queue size of that thread pool with -Dcassandra.max_queued_native_transport_requests=4096
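Because the counter is cumulative since the node started, the useful check after changing the setting is whether it keeps growing. Below is a minimal Python sketch for pulling the value out of tpstats output. It assumes the usual column layout, with the pool name first and "All time blocked" last, which may differ between versions. The helper name `all_time_blocked` and the sample text are hypothetical; only the 23953 figure comes from the question, and the other columns are placeholders.

```python
def all_time_blocked(tpstats_text, pool="Native-Transport-Requests"):
    """Return the last column ("All time blocked") for the given pool.

    Assumes each pool is one whitespace-separated row whose first field is
    the pool name and whose last field is the all-time-blocked counter.
    Returns None if the pool is not found.
    """
    for line in tpstats_text.splitlines():
        fields = line.split()
        if fields and fields[0] == pool:
            return int(fields[-1])
    return None


# Illustrative sample only: 23953 is Node 1's value from the question,
# the remaining columns are placeholder zeros.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
Native-Transport-Requests         0         0              0         0             23953
"""

print(all_time_blocked(SAMPLE))
```

On a real node you would feed it the text printed by nodetool tpstats, take two readings some minutes apart, and compare them. A value that no longer increases suggests the queue is now absorbing the bursts.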