
python cql driver - cassandra.ReadTimeout - "Operation timed out - received only 1 responses."

I am using Cassandra 2.0 with the Python CQL driver.

I have created a column family as follows:

CREATE KEYSPACE IF NOT EXISTS Identification
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
  'DC1' : 1 };

USE Identification;

CREATE TABLE IF NOT EXISTS entitylookup (
  name varchar,
  value varchar,
  entity_id uuid,
  PRIMARY KEY ((name, value), entity_id))
WITH
    caching=all
;

I then try to count the number of records in this CF as follows:

#!/usr/bin/env python
import argparse

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement


def count(host, cf):
    keyspace = "identification"
    cluster = Cluster([host], port=9042, control_connection_timeout=600000000)
    session = cluster.connect(keyspace)
    session.default_timeout = 600000000

    # Count every row in the column family, requiring all replicas to respond.
    st = SimpleStatement("SELECT count(*) FROM %s" % cf, consistency_level=ConsistencyLevel.ALL)
    try:
        for row in session.execute(st, timeout=600000000):
            print("count for cf %s = %s" % (cf, row))
    finally:
        # Release the driver's connections even if the query fails.
        cluster.shutdown()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-cf", "--column-family", default="entitylookup", help="Column Family to query")
    parser.add_argument("-H", "--host", default="localhost", help="Cassandra host")
    args = parser.parse_args()

    count(args.host, args.column_family)

    print("done")

The count itself is not that useful to me; it's just a test with an operation that takes a long time to complete.

Although I have defined timeout as 600000000 seconds, after less than 30 seconds I get the following error:

./count_entity_lookup.py  -H localhost -cf entitylookup 
    Traceback (most recent call last):
      File "./count_entity_lookup.py", line 27, in <module>
        count(args.host, args.column_family)
      File "./count_entity_lookup.py", line 16, in count
        for row in session.execute(st, timeout=None):
      File "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 1026, in execute
        result = future.result(timeout)
      File "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 2300, in result
        raise self._final_exception
    cassandra.ReadTimeout: code=1200 [Timeout during read request] message="Operation timed out - received only 1 responses." info={'received_responses': 1, 'data_retrieved': True, 'required_responses': 2, 'consistency': 5}

It seems the answer was found on only one replica, but this really doesn't make sense to me. Shouldn't Cassandra be able to answer the query anyway?

In the image below, you can see that the number of requests to the cluster was very low, and so was the latency. I am not sure why this is happening.

[Image: cluster request count and latency]

mvallebr asked May 30 '14 19:05


1 Answer

From the response:

'received_responses': 1, 'data_retrieved': True, 'required_responses': 2

Only one of the two required replicas responded, while the query requires consistency level ALL. Cassandra was not able to fulfill that request and timed out.
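To make the fields in that message easier to read, here is a minimal sketch in plain Python (no driver needed). The numeric consistency codes are the ones defined by the CQL native protocol; `CONSISTENCY_NAMES` and `explain_read_timeout` are hypothetical names made up for this illustration, not part of the driver.

```python
# Consistency codes as defined by the CQL native protocol.
CONSISTENCY_NAMES = {
    0: "ANY", 1: "ONE", 2: "TWO", 3: "THREE", 4: "QUORUM", 5: "ALL",
    6: "LOCAL_QUORUM", 7: "EACH_QUORUM", 8: "SERIAL", 9: "LOCAL_SERIAL",
    10: "LOCAL_ONE",
}


def explain_read_timeout(info):
    """Summarize the info dict of a ReadTimeout (hypothetical helper)."""
    level = CONSISTENCY_NAMES.get(info["consistency"], "UNKNOWN")
    received = info["received_responses"]
    required = info["required_responses"]
    return "%s: %d of %d required replicas answered" % (level, received, required)


# The info dict copied from the traceback in the question.
info = {'received_responses': 1, 'data_retrieved': True,
        'required_responses': 2, 'consistency': 5}
print(explain_read_timeout(info))
```

For the error in the question this prints `ALL: 1 of 2 required replicas answered`, i.e. code 5 is ALL and one replica did not reply in time.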

If all replicas are required to have the data, you can instead change the write consistency to ALL.

Reads could then be satisfied without consistency ALL, since that guarantee would already be met by the write request itself, though writes may fail if a node is offline.
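The reasoning behind this trade-off is the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write when write acknowledgements plus read responses exceed the replication factor. The sketch below only models that arithmetic; `read_overlaps_write` is a hypothetical helper, and the replication factor of 2 is an assumption inferred from `'required_responses': 2` in the error, not something stated in the question.

```python
def read_overlaps_write(write_acks, read_acks, replication_factor):
    """True if every read set must intersect every write set (W + R > RF)."""
    return write_acks + read_acks > replication_factor


RF = 2  # assumption: inferred from 'required_responses': 2 at level ALL

print(read_overlaps_write(RF, 1, RF))  # write at ALL, read at ONE
print(read_overlaps_write(1, 1, RF))   # write at ONE, read at ONE
print(read_overlaps_write(1, RF, RF))  # write at ONE, read at ALL
```

Writing at ALL and reading at ONE gives the same overlap guarantee as writing at ONE and reading at ALL; the difference is which side fails when a replica is down.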

See the documentation for an explanation of what each consistency level means.

LOCAL_QUORUM is what you would use to ensure that a majority of replicas, with respect to the replication factor, are contacted within a DC.
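As a rough illustration of what "majority" means here, a quorum is computed as floor(RF / 2) + 1. The snippet below is a plain-Python sketch of that formula; `quorum` is a hypothetical helper name, not a driver function.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority for the given replication factor."""
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 5):
    tolerated = rf - quorum(rf)
    print("RF=%d -> quorum=%d, tolerates %d replica(s) down" % (rf, quorum(rf), tolerated))
```

Note that with a replication factor of 2 the quorum is also 2, so if the replication factor in this cluster really is 2 (an inference from the error message, not confirmed in the question), LOCAL_QUORUM would still need both replicas to respond. A quorum only tolerates a failed replica from a replication factor of 3 upward.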

danny answered Oct 28 '22 00:10