Why are mongodb queries to a localhost instance of mongo so much faster than to a cloud instance?

Tags: node.js, mongodb, mongoose, latency

I'm using this code to run the tests outlined in this blog post.

(For posterity, relevant code pasted at the bottom).

What I've found is that if I run these experiments with a local instance of Mongo (in my case, using docker)

docker run -d -p 27017:27017 -v ~/data:/data/db mongo

Then I get pretty good performance, similar results as outlined in the blog post:

finished populating the database with 10000 users
default_query: 277.986ms
query_with_index: 262.886ms
query_with_select: 157.327ms
query_with_select_index: 136.965ms
lean_query: 58.678ms
lean_with_index: 65.777ms
lean_with_select: 23.039ms
lean_select_index: 21.902ms
[nodemon] clean exit - waiting

However, when I switch to using a cloud instance of Mongo, in my case an Atlas sandbox instance with the following configuration:

CLUSTER TIER
M0 Sandbox (General)
REGION
GCP / Iowa (us-central1)
TYPE
Replica Set - 3 nodes
LINKED STITCH APP
None Linked

(Note that I'm based in Melbourne, Australia).

Then I get much worse performance.

adding 10000 users to the database
finished populating the database with 10000 users
default_query: 8279.730ms
query_with_index: 8791.286ms
query_with_select: 5234.338ms
query_with_select_index: 4933.209ms
lean_query: 13489.728ms
lean_with_index: 10854.134ms
lean_with_select: 4906.428ms
lean_select_index: 4710.345ms

I get that obviously there's going to be some round trip overhead between my computer and the mongo instance, but I would expect that to add 200ms max.

It seems that the round trip time must be getting added multiple times, or that something else entirely is going on that I'm not aware of - can someone explain just what it is that would cause this to blow out?

A good answer might involve doing an explain plan, and explaining that in terms of network latency.
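As a rough sketch of what such a comparison could look like: the helper below takes an explain result in `executionStats` form plus a wall-clock time measured with `console.time()`, and reports how much of the time was spent outside the server. The name `summarizeExplain` and the sample numbers are hypothetical and only for illustration; the field names are the ones MongoDB reports under `executionStats`.

```javascript
// Hypothetical helper: separates the time the server reports spending on
// the query from everything else (network, driver, deserialization).
function summarizeExplain (explain, wallClockMs) {
  const stats = explain.executionStats
  return {
    serverMs: stats.executionTimeMillis,
    returned: stats.nReturned,
    keysExamined: stats.totalKeysExamined,
    docsExamined: stats.totalDocsExamined,
    outsideServerMs: wallClockMs - stats.executionTimeMillis
  }
}

// Illustrative values only, not measured output.
const sample = {
  executionStats: {
    executionTimeMillis: 5,
    nReturned: 3000,
    totalKeysExamined: 0,
    totalDocsExamined: 10000
  }
}
console.log(summarizeExplain(sample, 8279))
```

With Mongoose, something like `await User.find(query).explain('executionStats')` should supply the explain object, though the exact shape of the result may vary between Mongoose and driver versions, so treat the field access above as an assumption to verify.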

Tests against different Atlas instances - For those suggesting the issue is that I'm using a Sandbox instance of Atlas - here are the results for M20 and M30 instances:

BACKUPS
Active
CLUSTER TIER
M20 (General)
REGION
GCP / Iowa (us-central1)
TYPE
Replica Set - 3 nodes
LINKED STITCH APP
None Linked
BI CONNECTOR
Disabled

adding 10000 users to the database
finished populating the database with 10000 users
default_query: 9015.309ms
query_with_index: 8779.388ms
query_with_select: 4568.794ms
query_with_select_index: 4696.811ms
lean_query: 7694.718ms
lean_with_index: 7886.828ms
lean_with_select: 3654.518ms
lean_select_index: 5014.867ms

BACKUPS
Active
CLUSTER TIER
M30 (General)
REGION
GCP / Iowa (us-central1)
TYPE
Replica Set - 3 nodes
LINKED STITCH APP
None Linked
BI CONNECTOR
Disabled

adding 10000 users to the database
finished populating the database with 10000 users
default_query: 8268.799ms
query_with_index: 8933.502ms
query_with_select: 4740.234ms
query_with_select_index: 5457.168ms
lean_query: 9296.202ms
lean_with_index: 9111.568ms
lean_with_select: 4385.125ms
lean_select_index: 4812.982ms

These really don't show any significant difference (be aware that any difference may just be network noise).

Tests colocating the Mongo client and the mongo database instance

I created a docker container and ran it on Google's Cloud Run in the same region (us-central1). The results are:

2019-12-30 11:46:06.814 AEDT finished populating the database with 10000 users
2019-12-30 11:46:07.885 AEDT default_query: 1071.233ms
2019-12-30 11:46:08.917 AEDT query_with_index: 1031.952ms
2019-12-30 11:46:09.375 AEDT query_with_select: 457.659ms
2019-12-30 11:46:09.657 AEDT query_with_select_index: 281.678ms
2019-12-30 11:46:10.281 AEDT lean_query: 623.417ms
2019-12-30 11:46:10.961 AEDT lean_with_index: 680.622ms
2019-12-30 11:46:11.056 AEDT lean_with_select: 94.722ms
2019-12-30 11:46:11.148 AEDT lean_select_index: 91.984ms

So while this doesn't give results as fast as running on my own machine - it does show that colocating the client and the database gives a very large performance improvement.

So the question again is - why is the improvement ~7000ms?

The test code:

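const mongoose = require('mongoose')
// User, UserWithIndex and init() are defined in the linked test code and are not shown here.

;(async () => {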
  try {
    await mongoose.connect('mongodb://localhost:27017/perftest', {
      useNewUrlParser: true,
      useCreateIndex: true
    })

    await init()

    // const query = { age: { $gt: 22 } }
    const query = { favoriteFruit: 'potato' }

    console.time('default_query')
    await User.find(query)
    console.timeEnd('default_query')

    console.time('query_with_index')
    await UserWithIndex.find(query)
    console.timeEnd('query_with_index')

    console.time('query_with_select')
    await User.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
    console.timeEnd('query_with_select')

    console.time('query_with_select_index')
    await UserWithIndex.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
    console.timeEnd('query_with_select_index')

    console.time('lean_query')
    await User.find(query).lean()
    console.timeEnd('lean_query')

    console.time('lean_with_index')
    await UserWithIndex.find(query).lean()
    console.timeEnd('lean_with_index')

    console.time('lean_with_select')
    await User.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
      .lean()
    console.timeEnd('lean_with_select')

    console.time('lean_select_index')
    await UserWithIndex.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
      .lean()
    console.timeEnd('lean_select_index')
    process.exit(0)
  } catch (err) {
    console.error(err)
  }
})()


asked Dec 18 '19 00:12

dwjohnston

1 Answer

My best guess is that you're dealing with slow network throughput between your local machine and Atlas (something I've experienced myself this week - hence how I found this post!)

Looking at your local query performance:

default_query: 277.986ms

query_with_index: 262.886ms

The query with index isn't noticeably any faster than the one without. For an indexed query to take 262ms in a Node app with a local DB probably means that either:

The index isn't being used properly OR more likely...
You're returning quite a few results in the query. If the query returns say 3,000 results and each result is 1KB, that's 3MB of JSON data that your app needs to handle.
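One way to check the second possibility is to measure how large the result set is once serialized. The sketch below uses JSON size as a rough proxy (the BSON that travels over the wire will differ somewhat); `jsonPayloadBytes` and the sample documents are hypothetical stand-ins for real query results.

```javascript
// Size in bytes of a result set serialized as UTF-8 JSON.
// This is only a proxy for the BSON payload actually sent by the server.
function jsonPayloadBytes (docs) {
  return Buffer.byteLength(JSON.stringify(docs), 'utf8')
}

// Hypothetical documents standing in for the results of User.find(query).lean().
const docs = Array.from({ length: 3000 }, (_, i) => ({
  _id: i,
  name: 'user' + i,
  favoriteFruit: 'potato'
}))

const bytes = jsonPayloadBytes(docs)
console.log(`${docs.length} docs, ${(bytes / 1e6).toFixed(2)} MB as JSON`)
```

Running the same measurement on the real results (ideally from a `.lean()` query, so plain objects are serialized) would show whether the payload is in the megabyte range.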

I've got a 150Mbit/s internet connection, yet my throughput to Atlas (M2 shared tier, if that makes a difference) fluctuates between roughly 1Mbit/s and 6Mbit/s.

On localhost I have a Mongo query that returns 2,400 results for a total of 1.7MB of JSON data. The roundtrip time for that query in my Node app (using console.time() like you did) connected to Mongo on the same local dev machine is ~150ms. But when connecting that local app to Atlas, the query takes 2,400ms to 3,400ms to return. When I profiled the query on Atlas it took only 2ms to execute, so the query itself is really fast; it is apparently the data transfer that's slow.
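The arithmetic behind that conclusion can be sketched as follows. This is a simplified model that ignores latency, protocol overhead, and compression, and `transferSeconds` is a hypothetical helper name.

```javascript
// Seconds needed to move `bytes` at `mbitPerSec` (1 Mbit = 1,000,000 bits).
function transferSeconds (bytes, mbitPerSec) {
  return (bytes * 8) / (mbitPerSec * 1e6)
}

const payloadBytes = 1.7e6 // the 1.7MB result set described above

for (const mbit of [1, 6, 150]) {
  console.log(`${mbit} Mbit/s: ${transferSeconds(payloadBytes, mbit).toFixed(2)} s`)
}
// 1 Mbit/s: 13.60 s
// 6 Mbit/s: 2.27 s
// 150 Mbit/s: 0.09 s
```

Under this model, the observed 2,400ms to 3,400ms corresponds to an effective throughput of roughly 4 to 5.7 Mbit/s, which sits inside the 1 to 6 Mbit/s range measured against Atlas.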

Based on these results, I have a feeling that Atlas perhaps throttles throughput over the public internet (or just doesn't bother optimizing for it in their network) because 99% of apps are colocated in the same network region as their Atlas DB. That's the reason why they ask you to pick not just AWS, Azure, etc but your specific network region when creating a cluster.

UPDATE: I just ran a few Amazon EC2 speed tests for my network region (us-east-1) using a 3rd-party service and the average download speed was 4.5Mbit/s for smaller files (1KB to 128KB) and 41Mbit/s for larger files (256KB to 10MB). So the primary issue may be generally slow throughput on the EC2 instances that Atlas clusters run on rather than any throttling by Atlas, or perhaps a combination of both.

answered Sep 27 '22 22:09

Dave Koo

