I have a 5-node Cassandra 2.0.7 cluster, and each node has 4 HDDs. Recently one of the HDDs on node3 failed and was replaced with a shiny new empty drive. After the replacement, Cassandra on this node was unable to start, failing with this exception:
INFO [main] 2014-06-02 12:45:17,232 ColumnFamilyStore.java (line 254) Initializing system.paxos
INFO [main] 2014-06-02 12:45:17,236 ColumnFamilyStore.java (line 254) Initializing system.schema_columns
INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,237 SSTableReader.java (line 223) Opening /mnt/disk2/cassandra/system/schema_columns/system-schema_columns-jb-310 (25418 bytes)
INFO [main] 2014-06-02 12:45:17,241 ColumnFamilyStore.java (line 254) Initializing system.IndexInfo
INFO [main] 2014-06-02 12:45:17,245 ColumnFamilyStore.java (line 254) Initializing system.peers
INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,246 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/peers/system-peers-jb-25 (20411 bytes)
INFO [main] 2014-06-02 12:45:17,253 ColumnFamilyStore.java (line 254) Initializing system.local
INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,254 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/local/system-local-jb-35 (80 bytes)
INFO [SSTableBatchOpen:2] 2014-06-02 12:45:17,254 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/local/system-local-jb-34 (80 bytes)
ERROR [main] 2014-06-02 12:45:17,361 CassandraDaemon.java (line 237) Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Found system keyspace files, but they couldn't be loaded!
at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:532)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:233)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:462)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:552)
Because the Cassandra node is unable to start, I cannot use nodetool repair.
The only way I see to recover the node is to remove all of its data and bootstrap it again from nearly bare metal. Is there a shorter way to recover in a typical HDD-failure scenario?
If a node is down or unavailable during a write request, Cassandra handles this with hinted handoff: the coordinator node managing the write stores hints (the write mutations) and replays them to the replica when it comes back online.
The server will be down for more than 4 hours, and it is important to note that by default each node stores hints for only up to 3 hours. So Cassandra will not take care by itself of replicating the data that was created, updated, or deleted during those 4 hours. If you could limit the outage window to less than 3 hours, hinted handoff would cover the missed writes; otherwise the node will need a repair afterwards.
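As a quick check, the hint behaviour is controlled in cassandra.yaml; the snippet below is a minimal sketch that assumes the common config path /etc/cassandra/cassandra.yaml, and the commented values are the Cassandra 2.0 defaults:

# Inspect the hint-related settings on the node (config path assumed; adjust to your install)
grep -E 'hinted_handoff_enabled|max_hint_window_in_ms' /etc/cassandra/cassandra.yaml
# Typical 2.0 defaults:
#   hinted_handoff_enabled: true
#   max_hint_window_in_ms: 10800000   # 3 hours in milliseconds; raise this if the node will be down longer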
A node in Cassandra holds the actual data plus information about itself, such as its location and data center. A node contains data such as keyspaces, tables, and their schema, and you can perform operations such as reading, writing, and deleting data on a node.
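For illustration only, here is a minimal cqlsh sketch of those per-node operations; the keyspace demo, table users, and host name node3 are hypothetical stand-ins, not objects from this cluster:

# Write, read, and delete through a single node (any reachable node can coordinate the request)
echo "INSERT INTO demo.users (id, name) VALUES (42, 'alice');" | cqlsh node3
echo "SELECT * FROM demo.users WHERE id = 42;" | cqlsh node3
echo "DELETE FROM demo.users WHERE id = 42;" | cqlsh node3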
Fixed the issue with these steps (a command-level sketch follows below):
1. Physically removed the files belonging to the system keyspace: Cassandra was then able to start and recreated the system keyspace, but without any metadata about the other keyspaces.
2. Ran nodetool resetlocalschema, which pulled the keyspace schema back in from the other nodes.
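The following is a minimal sketch of those two steps under stated assumptions, not an exact transcript: the data directories /mnt/disk1 through /mnt/disk4, the backup naming, and the service name cassandra are guesses based on the paths in the log above, so adjust them to your layout.

# Stop Cassandra on the affected node first (service name assumed).
sudo service cassandra stop

# Move the system keyspace out of every data directory rather than deleting it,
# so it can be restored if something goes wrong
# (directories assumed; check data_file_directories in cassandra.yaml):
for d in /mnt/disk1 /mnt/disk2 /mnt/disk3 /mnt/disk4; do
    mv "$d/cassandra/system" "$d/cassandra/system.broken.$(date +%s)"
done

# Start Cassandra; it recreates an empty system keyspace on startup.
sudo service cassandra start

# Once the node is up, pull the schema back from the other nodes:
nodetool resetlocalschema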