We are planning to upgrade our cluster, which currently runs Cassandra 2.0.9, to 2.2.6. According to the documentation and some blogs, people upgrade Cassandra in place, i.e. they remove a node from the ring, upgrade it, and add it back again. We are skeptical of this approach because things can go wrong (this is a high-transaction database handling a good number of QPS).
So we were planning to add a new datacenter to the cluster that runs the upgraded Cassandra version (2.2). The setup would then have two datacenters: the old one (2.0.9) and the new one (2.2.6).
The new datacenter would initially be just a backup. Once it becomes stable, we would switch the client connections to it. If it performs well, we would keep it and shut down the old datacenter; otherwise we can fall back to the old datacenter and debug what went wrong.
Is this process feasible, or should we go for an in-place upgrade?
Can two Cassandra versions (2.0 and 2.2) coexist across datacenters?
Is there a downside to this approach?
Open cqlsh and type SHOW VERSION. This prints the versions of cqlsh, DSE, Cassandra, etc.
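SHOW VERSION only reports on the node cqlsh is connected to. When planning an upgrade it can help to see what every node reports. Below is a minimal Python sketch that reads release_version from the system.local and system.peers tables. The function name versions_by_node is hypothetical, and the session argument is assumed to be a DataStax Python driver session (or anything with a compatible execute method) whose rows expose columns as attributes.

```python
from collections import defaultdict


def versions_by_node(session):
    """Group nodes by the Cassandra release version they report.

    `session` is assumed to behave like a DataStax Python driver session:
    execute(query) returns rows whose columns are readable as attributes.
    """
    versions = defaultdict(list)

    # The node we are connected to describes itself in system.local.
    for row in session.execute("SELECT release_version FROM system.local"):
        versions[row.release_version].append("local")

    # Every other node known to the coordinator is listed in system.peers.
    for row in session.execute("SELECT peer, release_version FROM system.peers"):
        versions[row.release_version].append(str(row.peer))

    return dict(versions)
```

More than one key in the returned dictionary means the cluster is currently running mixed versions.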
By default, each Cassandra node is assigned 256 virtual nodes (vnodes). The Cassandra server runs the core processes, such as spreading replicas across nodes and routing requests.
Can two Cassandra versions (2.0 and 2.2) coexist across datacenters?
No, they cannot.
Is this process feasible, or should we go for an in-place upgrade?
You will need to perform an in-place upgrade. This is because Cassandra cannot stream across versions. Performing an in-place upgrade allows the new version to read the SSTables from the old version.
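For illustration, an in-place upgrade of a single node usually follows a drain, stop, install, start, rewrite-SSTables sequence. The sketch below only builds the command list and does not run anything. nodetool drain and nodetool upgradesstables are real nodetool subcommands; the use of ssh, the service name cassandra, and the install_cmd argument are assumptions that depend on how your nodes are packaged and managed.

```python
def node_upgrade_steps(host, install_cmd):
    """Return the shell commands for upgrading one node in place.

    `install_cmd` is a placeholder for whatever installs the new Cassandra
    binaries on your systems (package manager, tarball swap, and so on).
    """
    return [
        f"ssh {host} nodetool drain",                # flush memtables, stop accepting writes
        f"ssh {host} sudo service cassandra stop",   # assumed service name
        f"ssh {host} {install_cmd}",                 # install the new version
        f"ssh {host} sudo service cassandra start",  # node rejoins with the new version
        f"ssh {host} nodetool upgradesstables",      # rewrite SSTables in the new format
    ]


def rolling_plan(hosts, install_cmd):
    """Build a plan that upgrades one node at a time, in the given order."""
    return [(host, node_upgrade_steps(host, install_cmd)) for host in hosts]


if __name__ == "__main__":
    # Host names and the install command are made up for the example.
    for host, steps in rolling_plan(["node1", "node2"], "sudo apt-get install cassandra"):
        print(f"# {host}")
        for step in steps:
            print(step)
```

Because only one node is down at any moment, the rest of the ring keeps serving traffic while each node is upgraded.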
Is there a downside to this approach?
As I mentioned, you will not be able to stream data from your existing nodes to the new 2.2 DC. So bootstrapping, rebuilding, and repairing are all out of the question.
The other issue you have is that 2.2.6 is not "upgrade compatible" with 2.0.9. From this DataStax doc: Apache Cassandra versions requiring intermediate upgrades...
Apache Cassandra 2.2.x restrictions
You will first have to upgrade your entire cluster to Cassandra 2.1. Once the upgrade to 2.1 is complete, then you can upgrade your nodes to 2.2.6.
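The intermediate-upgrade rule can be written down as a small lookup. This is a sketch under stated assumptions: the DIRECT_UPGRADES table encodes only the 2.0 to 2.1 to 2.2 chain described here, and the function names are hypothetical. Check the upgrade documentation for any other versions before extending it.

```python
# Assumption: only the hops named in this answer are encoded.
# Each release series maps to the next series it can be upgraded to directly.
DIRECT_UPGRADES = {
    (2, 0): (2, 1),
    (2, 1): (2, 2),
}


def parse_series(version):
    """Turn '2.0.9' into the release series tuple (2, 0)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def upgrade_path(current, target):
    """List the release series to pass through, starting with the current one."""
    path = [parse_series(current)]
    goal = parse_series(target)
    while path[-1] != goal:
        next_series = DIRECT_UPGRADES.get(path[-1])
        if next_series is None:
            raise ValueError(f"no known upgrade path from {current} to {target}")
        path.append(next_series)
    return [f"{major}.{minor}" for major, minor in path]


if __name__ == "__main__":
    print(upgrade_path("2.0.9", "2.2.6"))  # ['2.0', '2.1', '2.2']
```

Each hop in the returned list is a full rolling upgrade of the whole cluster, completed before the next hop begins.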
Cassandra is a master-less distributed datastore. For Cassandra there's no such thing as a "backup" datacenter. If you're going to add another DC running 2.2, you're opting in to a mixed-version cluster setup, just as you would by upgrading nodes individually. The only advantage I see is that performance issues should be less likely due to the added nodes. However, adding another DC will make your cluster setup more complex and may introduce issues that you don't know about yet and that have nothing to do with running different versions. How would you bootstrap the new DC? How will taking down the old DC affect performance? The operational impact will be much bigger with this approach than with upgrading individual nodes.
If you really don't want to do rolling upgrades, I'd suggest setting up the second DC as a separate cluster, importing a backup, and doing some (load) testing. Also change your code to write to both clusters, and eventually switch to the new one once you're satisfied. If you don't want to spend that much effort, just do the rolling upgrade.
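Writing to both clusters could look roughly like the sketch below. The DualWriter class is hypothetical and not part of any driver. It assumes two session-like objects that each expose execute(statement, params), which matches how a DataStax Python driver session is called. The old cluster stays authoritative; failures on the new cluster are recorded rather than raised, so the experiment cannot break production writes.

```python
class DualWriter:
    """Send every write to the primary cluster and mirror it to a secondary one.

    Hypothetical helper: `primary` and `secondary` are assumed to be
    session-like objects with an execute(statement, params) method.
    """

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.failed = []  # mirrored writes that need to be replayed or inspected

    def execute(self, statement, params=None):
        # The primary (old) cluster decides success; its errors propagate.
        result = self.primary.execute(statement, params)
        try:
            # Best-effort mirror to the new cluster.
            self.secondary.execute(statement, params)
        except Exception as exc:
            self.failed.append((statement, params, exc))
        return result
```

Reads keep going to the old cluster until testing is done. A non-empty failed list means the two clusters have diverged and the new one needs those writes replayed before any cutover.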