
ScyllaDB: clustering key cartesian product size 600 is greater than maximum 100

Tags:

scylla

I am using the DataStax Java driver to query ScyllaDB, and I see this error while reading data from Scylla: RequestHandler: ip:9042 replied with server error (clustering key cartesian product size 600 is greater than maximum 100), defuncting connection.

ROHAN VADJE asked Jan 27 '20 18:01



1 Answer

This error is returned to prevent overly large restriction sets from being generated, which may put a strain on your server. If you're aware of the risks and know a reasonable upper bound on the number of restrictions for your queries, you can manually change the maximum in scylla.yaml, e.g. max_clustering_key_restrictions_per_query: 650. Note, however, that this option has a warning in its description, which should be acknowledged:

Maximum number of distinct clustering key restrictions per query.
This limit places a bound on the size of IN tuples, especially when multiple
clustering key columns have IN restrictions. Increasing this value can result
in server instability.

In particular, setting this flag above a couple of hundred is risky. 600 should be alright, but at this point you could also consider rephrasing your queries so that they have fewer values in their IN restrictions, perhaps by splitting some queries into multiple smaller ones.
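
As a rough illustration of splitting, here is a minimal client-side sketch in plain Java. It assumes the query restricts two clustering columns with IN lists (the question does not show the actual query, so the 30 x 20 = 600 breakdown is only a guess). The class and method names (InChunker, partition, maxChunkSize) are hypothetical and not part of the driver. The idea is to chunk one IN list so that each query's cartesian product stays within the limit, then run one query per chunk and merge the results yourself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the DataStax driver or Scylla.
class InChunker {

    // Splits values into consecutive sublists of at most chunkSize elements.
    public static <T> List<List<T>> partition(List<T> values, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < values.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, values.size());
            chunks.add(new ArrayList<>(values.subList(i, end)));
        }
        return chunks;
    }

    // Largest size for one IN list such that size * otherProduct <= limit,
    // where otherProduct is the product of the sizes of the other IN lists.
    public static int maxChunkSize(int limit, int otherProduct) {
        if (otherProduct < 1 || otherProduct > limit) {
            throw new IllegalArgumentException(
                "the other IN lists alone exceed the limit; chunk them as well");
        }
        return limit / otherProduct;
    }

    public static void main(String[] args) {
        // Assumed example: 30 values for the first clustering column and
        // 20 for the second, i.e. 30 * 20 = 600 combinations in one query.
        List<Integer> firstColumnValues = new ArrayList<>();
        for (int i = 0; i < 30; i++) {
            firstColumnValues.add(i);
        }
        int size = maxChunkSize(100, 20);
        List<List<Integer>> chunks = partition(firstColumnValues, size);
        // Each chunk would be bound to its own query; results are merged client-side.
        System.out.println(chunks.size() + " queries of up to " + size + " values each");
    }
}
```

Each chunk would then be bound as the IN list of a separate statement executed through the driver. Whether this is preferable to raising the limit depends on how many queries you end up issuing.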

Source from Scylla tracker: https://github.com/scylladb/scylla/pull/4797

Piotr Sarna answered Oct 21 '22 13:10