Cassandra nodes cannot communicate with each other, causing ReadTimeout

This is on DataStax Enterprise (DSE) version 4.8.5-1, which corresponds (I believe) to Cassandra 2.1.x.

I'm getting a lot of the following errors when querying from our application:

ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'data_retrieved': False, 'required_responses': 1, 'consistency': 1}

Digging into this further: a sample query (run using cqlsh locally on each node) returns on 3 of the nodes in the ring but fails with a ReadTimeout on the rest. It seems that only the nodes holding the replicas return a response, while the others cannot reach them at all.

Is there some configuration or known issue I should be looking at to fix this?

When the other nodes fail, I see this error in the logs:

ERROR [MessagingService-Outgoing-/10.0.10.14] 2016-04-25 20:46:46,818  CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Outgoing-/10.0.10.14,5,main]
java.lang.AssertionError: 371205
        at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:290) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.composites.AbstractCType$Serializer.serialize(AbstractCType.java:393) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.composites.AbstractCType$Serializer.serialize(AbstractCType.java:382) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.filter.ColumnSlice$Serializer.serialize(ColumnSlice.java:271) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.filter.ColumnSlice$Serializer.serialize(ColumnSlice.java:259) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.filter.SliceQueryFilter$Serializer.serialize(SliceQueryFilter.java:503) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.filter.SliceQueryFilter$Serializer.serialize(SliceQueryFilter.java:490) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.SliceFromReadCommandSerializer.serialize(SliceFromReadCommand.java:168) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.ReadCommandSerializer.serialize(ReadCommand.java:143) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.db.ReadCommandSerializer.serialize(ReadCommand.java:132) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:121) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:330) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:282) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
        at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]

Nodetool status output

Datacenter: primary
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns    Host ID                               Rack
UN  10.0.10.224  557.95 GB  1       ?       d1b984b0-50d4-4faa-b349-08bc0cf36447  RAC1
UN  10.0.10.225  740.11 GB  1       ?       16ab3c8c-476e-46c2-837c-6dbb89b7d40d  RAC1
UN  10.0.10.12   748.23 GB  1       ?       4127f0d7-6bd0-4dc8-b6a0-3b261e55b44e  RAC1
UN  10.0.10.45   629.27 GB  1       ?       f4499c5d-f892-43b8-97f3-dcce5be51fb8  RAC2
UN  10.0.10.13   592.57 GB  1       ?       41b58044-942d-4e77-a8de-95495b88a073  RAC1
UN  10.0.10.14   616.45 GB  1       ?       d2b568fb-13e1-4ff7-a247-3751a8ca49cf  RAC1
UN  10.0.10.15   623.23 GB  1       ?       fb10e521-8359-409b-bfd8-b27829157a80  RAC1
UN  10.0.10.21   538.56 GB  1       ?       72288b4c-bd1d-4398-9d95-5af312c2f904  RAC2
UN  10.0.10.25   616.63 GB  1       ?       4a8f04ff-a198-44d1-baf4-72cc430cd8a9  RAC2
UN  10.0.10.218  562.98 GB  1       ?       c00c375d-90bb-48c5-a8d0-7102a13db468  RAC2
UN  10.0.10.219  632.58 GB  1       ?       1e2ea144-35bd-412b-89b5-41544a347a75  RAC2
UN  10.0.10.220  746.85 GB  1       ?       d40f59c1-430a-4d96-9d7e-1e846b8eb1fc  RAC2
UN  10.0.10.221  575.89 GB  1       ?       7e407d6b-2bd5-43b4-9116-96ee72a926b2  RAC2
UN  10.0.10.222  639.98 GB  1       ?       bfd04ab8-7679-4474-8d47-984950bdd2c7  RAC1
UN  10.0.10.223  652.58 GB  1       ?       6366cd3e-7910-40bb-8a12-926c53adf95b  RAC1

The code for this assertion is here:

http://grepcode.com/file/repo1.maven.org/maven2/org.apache.cassandra/cassandra-all/2.1.1/org/apache/cassandra/utils/ByteBufferUtil.java?av=f#290

  • There's no obvious schema mismatch when looking at either the system.local or system.peers tables.
  • nodetool describecluster returns UNREACHABLE from some nodes

c4urself asked Apr 25 '16 19:04


1 Answer

You are probably hitting the 64 KB maximum key size limit; see http://wiki.apache.org/cassandra/FAQ#max_key_size
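The trace fails in ByteBufferUtil.writeWithShortLength, which, as the name suggests, prefixes the value with a 2-byte length, so nothing longer than 65535 bytes can be encoded. The number in the AssertionError (371205) appears to be the length it was asked to write. A rough Python illustration of that framing follows; it is not Cassandra's code, and `write_with_short_length` is a name made up for this sketch:

```python
import struct

MAX_UNSIGNED_SHORT = 0xFFFF  # 65535, the largest length a 2-byte prefix can hold


def write_with_short_length(value: bytes) -> bytes:
    """Hypothetical stand-in: frame value as <2-byte big-endian length><bytes>."""
    if len(value) > MAX_UNSIGNED_SHORT:
        # Plays the role of the assertion in the trace, reporting the offending length.
        raise ValueError(f"length {len(value)} does not fit in an unsigned short")
    return struct.pack(">H", len(value)) + value
```

Calling it with a 371205-byte value raises, while a short value is framed normally.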

Check your application code: something is probably sending Cassandra a 371205-byte value as a primary key. It might even be somebody trying to break your application, I don't know, because a 370 KB primary key is very unlikely to be legitimate. Restrict the key size in your application code.
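A minimal sketch of such a guard, in Python since the error in the question looks like it comes from the Python driver. `key_fits` and `MAX_KEY_BYTES` are hypothetical names, not part of any Cassandra or driver API, and the check assumes string keys are sent as UTF-8:

```python
MAX_KEY_BYTES = 65535  # assumed limit: lengths must fit in an unsigned 16-bit field


def key_fits(value, encoding="utf-8"):
    """Return True if a key value is small enough to send to Cassandra."""
    # Measure the encoded size, not the character count.
    raw = value.encode(encoding) if isinstance(value, str) else bytes(value)
    return len(raw) <= MAX_KEY_BYTES
```

Call this on each key value before executing the query and reject the request when it returns False. With a multi-column key the combined size likely counts too, so treat this as an upper bound per value rather than an exact rule.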

I don't know whether a bug report, fix, or workaround exists for this.

hll answered Sep 28 '22 01:09