
Severe performance drop with MongoDB Change Streams

Tags:

mongodb

I want to get real-time updates about MongoDB database changes in Node.js.

A single MongoDB change stream sends update notifications almost instantly. But when I open multiple (10+) streams, there are massive delays (up to several minutes) between database writes and notification arrival.

Here's how I set up a change stream:

    let cursor = collection.watch([
      {$match: {"fullDocument.room": roomId}},
    ]);
    cursor.stream().on("data", doc => {...});

I tried an alternative way to set up a stream, but it's just as slow:

    let cursor = collection.aggregate([
      {$changeStream: {}},
      {$match: {"fullDocument.room": roomId}},
    ]);
    cursor.forEach(doc => {...});

An automated process inserts tiny documents into the collection while collecting performance data.

Some additional details:

  • Open stream cursors count: 50
  • Write speed: 100 docs/second (batches of 10 using insertMany)
  • Runtime: 100 seconds
  • Average delay: 7.1 seconds
  • Largest delay: 205 seconds (not a typo, over three minutes)
  • MongoDB version: 3.6.2
  • Cluster setup #1: MongoDB Atlas M10 (3-member replica set)
  • Cluster setup #2: DigitalOcean Ubuntu droplet running a single-instance MongoDB deployment in Docker
  • Node.js CPU usage: <1%

Both setups produce the same issue. What could be going on here?

asked Jan 23 '18 by aedm


1 Answer

The default connection pool size in the Node.js MongoDB driver is 5. Since each change stream cursor holds a connection open, the pool must be at least as large as the number of open cursors; otherwise operations queue up waiting for a free connection, which produces exactly these growing notification delays.

In version 3.x of the Node Mongo Driver use 'poolSize':

const mongoConnection = await MongoClient.connect(URL, {poolSize: 100}); 

In version 4.x of the Node Mongo Driver use 'minPoolSize' and 'maxPoolSize':

const mongoConnection = await MongoClient.connect(URL, {minPoolSize: 100, maxPoolSize: 1000}); 

(Thanks to MongoDB Inc. for investigating this issue.)
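One way to avoid hard-coding the pool size is to derive it from the number of streams you intend to open. The helper below is a hypothetical sketch, not part of the driver API: it assumes one dedicated connection per change stream cursor plus some headroom for ordinary queries.

```javascript
// Hypothetical helper (not part of the MongoDB driver): size the connection
// pool so each change stream cursor gets its own connection, plus headroom
// for regular queries. Assumes one connection per open cursor, per the
// behavior described above.
function poolOptionsForStreams(streamCount, headroom = 5) {
  const needed = streamCount + headroom;
  return { minPoolSize: needed, maxPoolSize: needed * 2 };
}

// For the 50 cursors in the question:
const options = poolOptionsForStreams(50);
// options -> { minPoolSize: 55, maxPoolSize: 110 }
```

With the 4.x driver, the resulting object can be passed straight through, e.g. MongoClient.connect(URL, poolOptionsForStreams(50)).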

answered Sep 18 '22 by aedm