
How can the MongoDB Java driver determine if a replica set is in the process of automatic failover?

Our application is built on a MongoDB replica set. I'd like to catch all exceptions thrown during the time frame when the replica set is in the process of automatic failover, so that the application can retry or wait until the failover completes and the failover does not affect users. I found a document describing the behavior of the Java driver here: https://jira.mongodb.org/browse/DOCS-581

I wrote a test program to find all possible exceptions. They are all MongoException, but with different messages:

  1. MongoException.Network: "Read operation to server /10.11.0.121:27017 failed on database test"
  2. MongoException: "can't find a master"
  3. MongoException: "not talking to master and retries used up"
  4. MongoException: "No replica set members available in [ here is replica set status ] for { "mode" : "primary"}"
  5. Maybe more...

I'm confused and not sure whether it is safe to detect failover by error message. I also don't want to catch every MongoException. Any suggestions?

Thanks

William Bao asked Oct 22 '22 04:10


2 Answers

I am now of the opinion that Mongo in Java is particularly weak in this regard. I don't think your strategy of interpreting the error messages scales well or will survive driver evolution. This is, of course, opinion.

The good news is that the Mongo driver provides a way to get the status of a ReplicaSet: http://api.mongodb.org/java/2.11.1/com/mongodb/ReplicaSetStatus.html. You can use it directly to figure out whether there is a master visible to your application. If that is all you want to know, http://api.mongodb.org/java/2.11.1/com/mongodb/Mongo.html#getReplicaSetStatus() is all you need. Grab that status object, check for a non-null master, and you are on your way.

ReplicaSetStatus rss = mongo.getReplicaSetStatus();
boolean driverInFailover = rss.getMaster() == null;
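
Since the question asks about waiting for failover to complete, here is a minimal sketch of how a check like the one above could drive a polling wait. FailoverWait and awaitPrimary are hypothetical names, not part of the driver, and the primary check is passed in as a BooleanSupplier so the helper itself uses only the JDK (Java 8 or later assumed).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of the MongoDB driver. */
final class FailoverWait {
    private FailoverWait() {}

    /**
     * Polls primaryVisible until it returns true or timeoutMillis elapses.
     *
     * @return true if a primary became visible in time; false on timeout
     *         or if the waiting thread was interrupted
     */
    static boolean awaitPrimary(BooleanSupplier primaryVisible, long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (primaryVisible.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the snippet above, the supplier could be something like () -> mongo.getReplicaSetStatus().getMaster() != null, with a null check on the status object added if the client might not be connected to a replica set. The timeout and poll interval are yours to pick; nothing here is tuned.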

If what you really need is to figure out if the ReplSet is dead, read-only, or read-write, this gets more difficult. Here is the code that kind-of works for me. I hate it.

@Override
public ReplSetStatus getReplSetStatus() {
    ReplSetStatus rss = ReplSetStatus.DOWN;
    MongoClient freshClient = null;
    try {
        if ( mongo != null ) {
            ReplicaSetStatus replicaSetStatus = mongo.getReplicaSetStatus();
            if ( replicaSetStatus != null ) {
                if ( replicaSetStatus.getMaster() != null ) {
                    rss = ReplSetStatus.ReadWrite;
                } else {
                    /*
                     * When mongo.getReplicaSetStatus().getMaster() returns null, it takes a
                     * fresh client to assert whether the ReplSet is read-only or completely
                     * down. I freaking hate this, but take it up with 10gen.
                     */
                    freshClient = new MongoClient( mongo.getAllAddress(), mongo.getMongoClientOptions() );
                    replicaSetStatus = freshClient.getReplicaSetStatus();
                    if ( replicaSetStatus != null ) {
                        rss = replicaSetStatus.getMaster() != null ? ReplSetStatus.ReadWrite : ReplSetStatus.ReadOnly;
                    } else {
                        log.warn( "freshClient.getReplicaSetStatus() is null" );
                    }
                }
            } else {
                log.warn( "mongo.getReplicaSetStatus() returned null" );
            }
        } else {
            throw new IllegalStateException( "mongo is null?!?" );
        }
    } catch ( Throwable t ) {
        log.error( "Ignoring unexpected error", t );
    } finally {
        if ( freshClient != null ) {
            freshClient.close();
        }
    }
    log.debug( "getReplSetStatus(): {}", rss );
    return rss;
}

I hate it because it doesn't follow the Mongo Java Driver convention that your application needs only a single Mongo, and that through this singleton you reach the rest of the Mongo data structures (DB, Collection, etc.). I have only been able to observe this working by new'ing up a second Mongo during the check, so that I can rely upon the ReplicaSetStatus null check to discriminate between "ReplSet-DOWN" and "read-only".
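
The ReplSetStatus type used in the method above is my own and is not shown, so here is a guess at its shape together with the decision the fresh-client branch boils down to. ReplSetHealth and classify are hypothetical names for illustration only; the enum constants are the ones the method references.

```java
/** Hypothetical holder class; only the three constant names come from the code above. */
final class ReplSetHealth {
    private ReplSetHealth() {}

    enum ReplSetStatus { DOWN, ReadOnly, ReadWrite }

    /**
     * @param statusAvailable true if the fresh client returned a non-null ReplicaSetStatus
     * @param masterVisible   true if that status reports a non-null master
     */
    static ReplSetStatus classify(boolean statusAvailable, boolean masterVisible) {
        if (!statusAvailable) {
            return ReplSetStatus.DOWN;
        }
        return masterVisible ? ReplSetStatus.ReadWrite : ReplSetStatus.ReadOnly;
    }
}
```

Pulling the decision out like this keeps the three outcomes in one place and lets you test them without a live replica set.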

What is really needed in this driver is some way to ask direct questions of the Mongo to see if the ReplSet can be expected at this moment to support each of the WriteConcerns or ReadPreferences. Something like...

/**
 * @return true if current state of Client can support readPreference, false otherwise
 */
boolean mongo.canDoRead( ReadPreference readPreference )

/**
 * @return true if current state of Client can support writeConcern; false otherwise
 */
boolean mongo.canDoWrite( WriteConcern writeConcern ) 

This makes sense to me because it acknowledges the fact that the ReplSet may have been great when the Mongo was created, but conditions right now mean that Read or Write operations of a specific type may fail due to changing conditions.
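
To make the wish concrete, here is a rough sketch of what such checks might compute. None of this exists in the driver: ReplSetCapabilities and ReadMode are hypothetical stand-ins (ReadMode just mirrors the names of the read preference modes), and the write check is a deliberate simplification that handles only numeric w values and ignores arbiters, tags, and "majority".

```java
/** Hypothetical sketch; not part of the MongoDB Java driver. */
final class ReplSetCapabilities {
    /** Stand-in for the read preference modes, not the driver's ReadPreference class. */
    enum ReadMode { PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED, NEAREST }

    private final boolean primaryVisible;
    private final int secondariesVisible;

    ReplSetCapabilities(boolean primaryVisible, int secondariesVisible) {
        this.primaryVisible = primaryVisible;
        this.secondariesVisible = secondariesVisible;
    }

    /** @return true if the members currently visible could serve a read in the given mode */
    boolean canDoRead(ReadMode mode) {
        switch (mode) {
            case PRIMARY:
                return primaryVisible;
            case SECONDARY:
                return secondariesVisible > 0;
            default:
                // The preferred modes and NEAREST can fall back to any visible member.
                return primaryVisible || secondariesVisible > 0;
        }
    }

    /** @return true if a primary is visible and at least w data-bearing members are visible */
    boolean canDoWrite(int w) {
        int dataBearing = (primaryVisible ? 1 : 0) + secondariesVisible;
        return primaryVisible && dataBearing >= w;
    }
}
```

A snapshot like this is only as fresh as the moment it was taken, so an operation can still fail right after the check says yes.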

In any event, maybe http://api.mongodb.org/java/2.11.1/com/mongodb/ReplicaSetStatus.html gets you what you need.

Bob Kuhar answered Oct 31 '22 16:10


When Mongo is failing over, there are no nodes in a PRIMARY state. You can just get the replica set status via the replSetGetStatus command and look for a master node. If you don't find one, you can assume that the cluster is in a failover transition state, and can retry as desired, checking the replica set status on each failed connection.
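
As a rough sketch of the "look for a master node" step: the replSetGetStatus reply contains a members array, and each member carries a numeric state field in which 1 means PRIMARY. The helper below assumes the reply can be treated as nested java.util Maps and Lists; ReplSetStatusDoc and hasPrimary are hypothetical names, and actually running the command through the driver is left out.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper for inspecting a replSetGetStatus reply viewed as Maps and Lists. */
final class ReplSetStatusDoc {
    private static final int PRIMARY_STATE = 1; // member state code for PRIMARY

    private ReplSetStatusDoc() {}

    /** @return true if any entry of the "members" array reports state 1 (PRIMARY) */
    static boolean hasPrimary(Map<String, ?> status) {
        Object members = status.get("members");
        if (!(members instanceof List)) {
            return false;
        }
        for (Object member : (List<?>) members) {
            if (!(member instanceof Map)) {
                continue;
            }
            Object state = ((Map<?, ?>) member).get("state");
            if (state instanceof Number && ((Number) state).intValue() == PRIMARY_STATE) {
                return true;
            }
        }
        return false;
    }
}
```

Keep in mind that during failover the command has to be sent to a member that is still reachable, and the reply reflects that member's view of the set.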

Chris Heald answered Oct 31 '22 17:10