Mongos routing with ReadPreference=NEAREST

I'm having trouble diagnosing an issue where my Java application's requests to MongoDB are not getting routed to the nearest replica set member, and I hope someone can help. Let me start by explaining my configuration.

The Configuration:

I am running a MongoDB deployment in production that is a sharded cluster. It currently has only a single shard (it hasn't grown large enough yet to require a split). This single shard is backed by a 3-node replica set. Two nodes of the replica set live in our primary data center. The third node lives in our secondary data center and is prohibited from becoming primary (priority 0).

We run our production application simultaneously in both data centers; however, the instance in our secondary data center operates in "read-only" mode and never writes data to MongoDB. It only serves client requests for reads of existing data. The objective of this configuration is to ensure that if our primary data center goes down, we can still serve client read traffic.

We don't want to waste all of this hardware in our secondary datacenter, so even in happy times we actively load balance a portion of our read-only traffic to the instance of our application running in the secondary datacenter. This application instance is configured with readPreference=NEAREST and is pointed at a mongos instance running on localhost (version 2.6.7). The mongos instance is obviously configured to point at our 3-node replica set.
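For context, the read preference is set on the client side of the connection. A minimal sketch of how the Java application connects through the local mongos, assuming the 2.x-era Java driver available at the time (the class name, port, and database name here are illustrative placeholders, not my exact config):

```java
import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;

public class NearestReadClient {
    public static void main(String[] args) {
        // Connect to the mongos running on localhost. readPreference=nearest
        // asks the cluster to route reads to the lowest-latency eligible
        // member, which may be either the primary or a secondary.
        MongoClientURI uri = new MongoClientURI(
                "mongodb://localhost:27017/MyApplicationData?readPreference=nearest");
        MongoClient client = new MongoClient(uri);

        // Should report the "nearest" read preference mode.
        System.out.println(client.getReadPreference());
    }
}
```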

From a mongos:

mongos> sh.status()
--- Sharding Status --- 
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("52a8932af72e9bf3caad17b5")
}
shards:
{  "_id" : "shard1",  "host" : "shard1/failover1.com:27028,primary1.com:27028,primary2.com:27028" }
databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "test",  "partitioned" : false,  "primary" : "shard1" }
{  "_id" : "MyApplicationData",  "partitioned" : false,  "primary" : "shard1" }

From the failover node of the replicaset:

shard1:SECONDARY> rs.status()
{
"set" : "shard1",
"date" : ISODate("2015-09-03T13:26:18Z"),
"myState" : 2,
"syncingTo" : "primary1.com:27028",
"members" : [
{
    "_id" : 3,
    "name" : "primary1.com:27028",
    "health" : 1,
    "state" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 674841,
    "optime" : Timestamp(1441286776, 2),
    "optimeDate" : ISODate("2015-09-03T13:26:16Z"),
    "lastHeartbeat" : ISODate("2015-09-03T13:26:16Z"),
    "lastHeartbeatRecv" : ISODate("2015-09-03T13:26:18Z"),
    "pingMs" : 49,
    "electionTime" : Timestamp(1433952764, 1),
    "electionDate" : ISODate("2015-06-10T16:12:44Z")
},
{
    "_id" : 4,
    "name" : "primary2.com:27028",
    "health" : 1,
    "state" : 2,
    "stateStr" : "SECONDARY",
    "uptime" : 674846,
    "optime" : Timestamp(1441286777, 4),
    "optimeDate" : ISODate("2015-09-03T13:26:17Z"),
    "lastHeartbeat" : ISODate("2015-09-03T13:26:18Z"),
    "lastHeartbeatRecv" : ISODate("2015-09-03T13:26:18Z"),
    "pingMs" : 53,
    "syncingTo" : "primary1.com:27028"
},
{
    "_id" : 5,
    "name" : "failover1.com:27028",
    "health" : 1,
    "state" : 2,
    "stateStr" : "SECONDARY",
    "uptime" : 8629159,
    "optime" : Timestamp(1441286778, 1),
    "optimeDate" : ISODate("2015-09-03T13:26:18Z"),
    "self" : true
}
],
"ok" : 1
}


shard1:SECONDARY> rs.conf()
{
    "_id" : "shard1",
    "version" : 15,
    "members" : [
    {
        "_id" : 3,
        "host" : "primary1.com:27028",
        "tags" : {
            "dc" : "primary"
        }
    },
    {
        "_id" : 4,
        "host" : "primary2.com:27028",
        "tags" : {
            "dc" : "primary"
        }
    },
    {
        "_id" : 5,
        "host" : "failover1.com:27028",
        "priority" : 0,
        "tags" : {
            "dc" : "failover"
        }
    }
    ],
    "settings" : {
        "getLastErrorModes" : {"ACKNOWLEDGED" : {}}
    }
}

The Problem:

The problem is that requests which hit this mongos in our secondary datacenter seem to be getting routed to a replica running in our primary datacenter, not the nearest node, which is running in the secondary datacenter. This incurs a significant amount of network latency and results in bad read performance.

My understanding is that the mongos is deciding which node in the replica set to route the request to, and it's supposed to honor the ReadPreference from my java driver's request. Is there a command I can run in the mongos shell to see the status of the replica set, including ping times to nodes? Or some way to see logging of incoming requests which indicates the node in the replicaSet that was chosen and why? Any advice at all on how to diagnose the root cause of my issue?

asked Sep 02 '15 by skelly


2 Answers

When ReadPreference = NEAREST is configured without a tag set, the system does not simply prefer secondaries: it selects the member with the lowest network latency, and that may turn out to be the primary if its connection is fast enough. The nearest read mode, when combined with a tag set, selects the matching member with the lowest network latency; even then, the chosen member may be either the primary or a secondary. How mongos weighs network latency once a read preference is configured is not clearly explained in the official docs.

http://docs.mongodb.org/manual/core/read-preference/#replica-set-read-preference-tag-sets
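Since the replica set in the question already tags its members with a dc tag, one way to pin reads from the secondary data center to the local member is to combine nearest with that tag set. A sketch in the mongo shell, run against the local mongos (the collection name is illustrative; the trailing empty document is a fallback tag set that allows any member if no dc:failover member is available):

```js
// Restrict candidates to members tagged dc:failover; "nearest" then
// picks the lowest-latency match. The {} fallback permits any member
// if the failover node is down.
db.getMongo().setReadPref("nearest", [ { "dc": "failover" }, { } ]);

// Or per-query, on the cursor:
db.getSiblingDB("MyApplicationData").mycollection.find()
    .readPref("nearest", [ { "dc": "failover" }, { } ]);
```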

Hope this helps.

answered Oct 25 '22 by TharunRaja


If I start mongos with the flag -vvvv (4x verbose), the log files include request-routing information, including the read preference used and the host each request was routed to. For example:

2015-09-10T17:17:28.020+0000 [conn3] dbclient_rs say 
using secondary or tagged node selection in shard1, 
read pref is { pref: "nearest", tags: [ {} ] } 
    (primary : primary1.com:27028, 
    lastTagged : failover1.com:27028)
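If restarting mongos isn't convenient, I believe the same verbosity can be raised at runtime via setParameter; a sketch from the mongos shell:

```js
// Raise the mongos log level to 4 without a restart, reproduce the
// reads, then inspect the mongos log for the routing lines above.
db.adminCommand({ setParameter: 1, logLevel: 4 })

// Remember to turn verbosity back down afterwards:
db.adminCommand({ setParameter: 1, logLevel: 0 })
```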
answered Oct 25 '22 by skelly