Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Connection to CosmosDB through Mongo API fails after idle

We have a Scala server which uses the Java MongoDB driver as wrapped by Casbah. Recently, we switched its database over from an actual MongoDB to Azure CosmosDB, using the Mongo API. This is generally working fine, however every once in a while a call to Cosmos fails with a MongoSocketWriteException (stack trace below).

We're creating the client as

import com.mongodb.casbah.Imports._

val mongoUrl = "mongodb://username:[email protected]:10255/?ssl=true&replicaSet=globaldb"

val client = MongoClient(MongoClientURI(mongoUrl))
val collection: MongoCollection = client("mongoDatabase")("mongoCollection")

We tried removing &replicaSet=globaldb from the connection URI as per the suggested workaround for this seemingly similar bug (How to solve MongoError: pool destroyed while connecting to CosmosDB), but it didn't fix the problem.

Stack trace:

com.mongodb.MongoSocketWriteException: Exception sending message
    at com.mongodb.connection.InternalStreamConnection.translateWriteException(InternalStreamConnection.java:462)
    at com.mongodb.connection.InternalStreamConnection.sendMessage(InternalStreamConnection.java:205)
    at com.mongodb.connection.UsageTrackingInternalConnection.sendMessage(UsageTrackingInternalConnection.java:95)
    at com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendMessage(DefaultConnectionPool.java:424)
    at com.mongodb.connection.CommandProtocol.sendMessage(CommandProtocol.java:209)
    at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:111)
    at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
    at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
    at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:206)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:112)
    at com.mongodb.operation.CountOperation$1.call(CountOperation.java:210)
    at com.mongodb.operation.CountOperation$1.call(CountOperation.java:206)
    at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:230)
    at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:203)
    at com.mongodb.operation.CountOperation.execute(CountOperation.java:206)
    at com.mongodb.operation.CountOperation.execute(CountOperation.java:53)
    at com.mongodb.Mongo.execute(Mongo.java:772)
    at com.mongodb.Mongo$2.execute(Mongo.java:759)
    at com.mongodb.DBCollection.getCount(DBCollection.java:962)
    at com.mongodb.DBCursor.count(DBCursor.java:670)
    at com.mongodb.casbah.MongoCollectionBase.getCount(MongoCollection.scala:496)
    at com.mongodb.casbah.MongoCollectionBase.getCount$(MongoCollection.scala:488)
    at com.mongodb.casbah.MongoCollection.getCount(MongoCollection.scala:1106)
    at com.mongodb.casbah.MongoCollectionBase.count(MongoCollection.scala:897)
    at com.mongodb.casbah.MongoCollectionBase.count$(MongoCollection.scala:894)
    at com.mongodb.casbah.MongoCollection.count(MongoCollection.scala:1106)
    [snip]
Caused by: java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
    at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
    at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:876)
    at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:847)
    at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
    at com.mongodb.connection.SocketStream.write(SocketStream.java:75)
    at com.mongodb.connection.InternalStreamConnection.sendMessage(InternalStreamConnection.java:201)
    ... 38 common frames omitted

(Posting this with an answer because I'm hoping the solution will be useful for others, and because I'd welcome any further insight.)

like image 222
Astrid Avatar asked Jan 26 '18 15:01

Astrid


People also ask

Is Cosmosdb compatible with MongoDB?

The Azure Cosmos DB for MongoDB is compatible with MongoDB server version 3.6 by default for new accounts. The supported operators and any limitations or exceptions are listed below. Any client driver that understands these protocols should be able to connect to Azure Cosmos DB for MongoDB.

Which API is best for Cosmos DB?

API for Apache Cassandra This API for Cassandra is wire protocol compatible with native Apache Cassandra. You should consider API for Cassandra if you want to benefit from the elasticity and fully managed nature of Azure Cosmos DB and still use most of the native Apache Cassandra features, tools, and ecosystem.

How do I connect my Cosmosdb connection string?

In an Internet browser, sign in to the Azure portal. In the Azure Cosmos DB blade, select the API. In the left pane of the account blade, click Connection String.

Is Cosmos DB encryption at rest?

Because all user data stored in Azure Cosmos DB is encrypted at rest and in transport, you don't have to take any action. Another way to put this is that encryption at rest is "on" by default. There are no controls to turn it off or on.


1 Answers

The problem went away after we added &maxIdleTimeMS=1500000 to the connection URI in order to set the maximum connection idle time to 25 minutes.

The cause seems to be a timeout of 30 minutes for idle connections on the Azure server, while the default behaviour for Mongo clients is no idle timeout at all. The server does not communicate the fact that it is dropping an idled connection back to the client, so that the next attempt at using it fails with the above error. Setting the maximum connection idle time to a value less than 30 minutes makes our server close idle connections before the Azure server kills them. Some sort of keep-alive or check before using a connection would probably also be possible.

I haven't actually been able to find any documentation about this or other references to this problem for CosmosDB, although it may be caused by or related to the 30 minute idle timeout for TCP connections for Azure Internal Load Balancers (see e.g. https://feedback.azure.com/forums/217313-networking/suggestions/18823588-increase-idle-timeout-on-internal-load-balancers-t).

like image 129
Astrid Avatar answered Oct 27 '22 09:10

Astrid