
Creating new table with cqlsh on existing keyspace: Column family ID mismatch

Houston, we have a problem.

Trying to create a new table with cqlsh on an existing Cassandra (v2.1.3) keyspace results in:

ServerError: <ErrorMessage code=0000 [Server error] message="java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e8c03790-c952-11e4-a753-5981ea73cd7c; expected e8b14370-c952-11e4-a844-8f10bfb9c386)">

After the first create attempt, trying once more will result in:

AlreadyExists: Table 'ks.metrics' already exists

But listing the keyspace's tables with desc tables; does not report the new table.

The issue seems related to CASSANDRA-8387, except that only one client is trying to create the table: cqlsh.

We do have a bunch of Spark jobs that will create the keyspaces and tables at startup, potentially doing this in parallel. Would this render the keyspace corrupt?

Creating a new keyspace and adding a table to it works as expected.

Any ideas?

UPDATE

Found a workaround: issue a repair on the keyspace and the tables will appear (desc tables) and are also functional.

asked Mar 13 '15 11:03 by maasg



1 Answer

Short answer: Cassandra has a race condition in schema migration, which the developers thought they had resolved in 1.1.8...


Long answer:

I get that error all the time on one of my clusters. I have test machines that have really slow hard drives and creating one or two tables is enough to get the error when I have 4 nodes on two separate computers.

At the end of this answer is a copy of the stack trace from my Cassandra installation (3.9, according to the jar names in the trace). Although your version was 2.1.3, I would be surprised if this part of the code had changed much.

As the trace shows, the exception is thrown in validateCompatibility(). That function requires the following fields to be equal in the old and new versions of the table metadata (CFMetaData):

  • ksName (keyspace name)
  • cfName (columnfamily name)
  • cfId (columnfamily UUID)
  • flags (isSuper, isCounter, isDense, isCompound)
  • comparator (key sorting comparator)

If any one of these values does not match between the old and new metadata, the function throws an exception. In our case, the cfId values differ.
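The check described above can be modeled in a few lines. This is a minimal sketch, not Cassandra's actual CFMetaData source: the TableMeta class is a hypothetical stand-in, only three of the five compared attributes are modeled, and IllegalStateException stands in for ConfigurationException.

```java
import java.util.UUID;

// Hypothetical, simplified stand-in for Cassandra's table metadata.
final class TableMeta {
    final String ksName;
    final String cfName;
    final UUID cfId;

    TableMeta(String ksName, String cfName, UUID cfId) {
        this.ksName = ksName;
        this.cfName = cfName;
        this.cfId = cfId;
    }

    // Rejects an updated definition whose identity differs from this (current) one.
    void validateCompatibility(TableMeta updated) {
        if (!ksName.equals(updated.ksName)) {
            throw new IllegalStateException(String.format(
                "Keyspace mismatch (found %s; expected %s)", updated.ksName, ksName));
        }
        if (!cfName.equals(updated.cfName)) {
            throw new IllegalStateException(String.format(
                "Column family mismatch (found %s; expected %s)", updated.cfName, cfName));
        }
        if (!cfId.equals(updated.cfId)) {
            throw new IllegalStateException(String.format(
                "Column family ID mismatch (found %s; expected %s)", updated.cfId, cfId));
        }
    }

    // Convenience wrapper: returns null when compatible, otherwise the error message.
    static String check(TableMeta current, TableMeta updated) {
        try {
            current.validateCompatibility(updated);
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same keyspace and table name, but two different IDs (taken from the question).
        TableMeta current = new TableMeta("ks", "metrics",
            UUID.fromString("e8b14370-c952-11e4-a844-8f10bfb9c386"));
        TableMeta incoming = new TableMeta("ks", "metrics",
            UUID.fromString("e8c03790-c952-11e4-a753-5981ea73cd7c"));
        System.out.println(check(current, incoming));
    }
}
```

The point of the sketch is that two definitions of the same table name are treated as incompatible as soon as their IDs differ, even if everything else is identical.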

Going up the stack, apply() calls validateCompatibility() immediately.

Next we have updateTable(). It calls apply() almost immediately, after first calling getCFMetaData() to retrieve the current ("old") column family metadata that the new data is compared against.

Next comes updateKeyspace(). That function computes a diff to find out what changed, then applies the changes for each kind of schema object in turn. Tables are handled second, after types.

Before that comes mergeSchema(), which works out what changed at the keyspace level. It drops the keyspaces that were deleted and builds new keyspace definitions for those that were created or updated. Finally, it loops over those keyspaces, calling updateKeyspace() on each one.

Next in the stack is an interesting function: mergeSchemaAndAnnounceVersion(). It updates the schema version once the keyspaces have been updated in memory and on disk. The schema version covers the cfId, which is the value found to be incompatible here, hence the exception. The "announce" part sends a gossip message telling the other nodes that this node now knows the new version of the schema.

Next we see MigrationTask. This is the message used to propagate schema changes between Cassandra nodes. Its payload is a collection of mutations (the ones handled by mergeSchema()).

The rest of the stack is the generic message-handling and thread-pool machinery (doVerb(), run() and so on).

In my case, the problem resolves itself a little later and all is well; I do not have to do anything for the schema to get back in sync. It does, however, prevent me from creating all my tables in one go. My take, looking at this, is that the migration messages do not arrive in the expected order. There must be a timeout that is handled by resending the event, and that causes the mix-up.

So, let's look at the code that sends the message in the first place; it is in MigrationManager. There we find a MIGRATION_DELAY_IN_MS parameter, introduced for an old issue, "Schema push/pull race", to avoid a race condition. Well... there you go. The developers know that a race condition is possible, and they added a small delay to try to avoid it. Part of that fix is a version check: if the versions are already equal, the update is skipped altogether (i.e. that gossip is ignored).

if (Schema.instance.getVersion().equals(currentVersion)) {
    logger.debug("not submitting migration task for {} because our versions match", endpoint);
    return;
}

The delay we are talking about is one minute:

public static final int MIGRATION_DELAY_IN_MS = 60000; 

One would think that one whole minute would suffice, but somehow I still get the error all the time.
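To make the "wait, re-check, then maybe pull" idea concrete, here is a simplified model of it. It is only a sketch under stated assumptions: DelayedSchemaPull and pullAfterDelay are hypothetical names, the schema versions are plain UUIDs supplied by the caller, and no real gossip or migration task is involved.

```java
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical model of a delayed schema pull guarded by a version check.
final class DelayedSchemaPull {

    // Waits delayMs, then compares versions. Returns true if a pull would be
    // submitted, false if it is skipped because the versions already match.
    static boolean pullAfterDelay(Supplier<UUID> localVersion, UUID remoteVersion, long delayMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<Boolean> decision = scheduler.schedule(() -> {
                // Read the local version only after the delay: it may have
                // caught up with the remote one in the meantime.
                if (localVersion.get().equals(remoteVersion)) {
                    return false; // versions match, nothing to pull
                }
                return true; // a real node would request the schema here
            }, delayMs, TimeUnit.MILLISECONDS);
            return decision.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

The guard only helps when the local version has caught up by the time the delay expires; if further schema changes keep arriving in the meantime, the versions still differ and the pull goes ahead.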

The fact is that the code does not expect several changes in quick succession combined with large delays like the ones I have. If I create one table and then do other things, I am just fine. On the other hand, when I create 20 tables in a row on those slow machines, the gossip message from a previous schema change arrives late (i.e. after the new CREATE TABLE command has reached that node). That is when I get the error. The worst part, I guess, is that the error is spurious: what it really means is that the gossip message arrived late and carries an old schema, not that my schema is invalid.
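One way to avoid firing many schema changes back to back is to apply them one at a time and wait until the cluster reports a single schema version before sending the next. The following is a minimal sketch of that idea, not something taken from Cassandra or from a particular driver: SequentialDdl and the SchemaClient interface are hypothetical and would have to be mapped onto whatever your client library offers for executing CQL and checking schema agreement.

```java
import java.util.List;

// Hypothetical helper that serializes schema changes.
final class SequentialDdl {

    // Hypothetical abstraction over a real client; not an actual driver API.
    interface SchemaClient {
        void execute(String cql);
        boolean schemaInAgreement();
    }

    // Applies the statements in order, waiting for schema agreement after each.
    // Returns the number of statements applied; throws if agreement times out.
    static int applyAll(SchemaClient client, List<String> statements, int maxPolls, long pollMs) {
        int applied = 0;
        for (String cql : statements) {
            client.execute(cql);
            if (!waitForAgreement(client, maxPolls, pollMs)) {
                throw new IllegalStateException("No schema agreement after: " + cql);
            }
            applied++;
        }
        return applied;
    }

    // Polls up to maxPolls times, sleeping pollMs between attempts.
    static boolean waitForAgreement(SchemaClient client, int maxPolls, long pollMs) {
        for (int i = 0; i < maxPolls; i++) {
            if (client.schemaInAgreement()) {
                return true;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

On slow nodes the polling budget (maxPolls times pollMs) would need to be generous. Whether this fully avoids the error depends on the actual cause, which the analysis above only guesses at.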

org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 122a2d20-9e13-11e6-b830-55bace508971; expected 1213bef0-9e
    at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:790) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:750) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.config.Schema.updateTable(Schema.java:661) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1350) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1306) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1256) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
answered Oct 14 '22 05:10 by Alexis Wilke