 

Is it OK to query a MongoDB multiple times per request?


Coming from an RDBMS background, I was always under the impression "Try as hard as you can to use one query, assuming it's efficient," meaning that it's costly for every request you make to the database. When it comes to MongoDB, it seems like this might not be possible because you can't join tables.

I understand that it's not supposed to be relational, but it's also being pushed for things like blogs and forums, which I'd find easier to approach with an RDBMS.

There are some hang-ups I've had trying to understand the efficiency of MongoDB, or NoSQL in general. If I wanted to get all "posts" related to certain users (as if they were grouped), in MySQL I'd probably do some joins and get them that way.

In MongoDB, assuming I need the collections separate, would it be efficient to use a large $in: ['user1', 'user2', 'user3', 'user4', ...] ?

Does that method get slow after a while, say with 1000 users? And if I needed to get the list of posts related to users X, Y, and Z, would it be efficient and/or fast in MongoDB to do:

  • Get users array
  • Get Posts IN users array

2 queries for one request. Is that bad practice in NoSQL?
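
Concretely, the two-query pattern I have in mind would look something like this rough sketch (using pymongo; the users and posts collections and their field names are just placeholders):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumed local instance
    db = client["blog"]                                # hypothetical database

    # Query 1: get the users in the group
    user_ids = [u["_id"] for u in db.users.find({"group": "some-group"}, {"_id": 1})]

    # Query 2: get all posts authored by any of those users
    posts = list(db.posts.find({"author": {"$in": user_ids}}))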

Asked Mar 01 '11 by Matt Kenefick

People also ask

Can we run query efficiently in MongoDB?

Performance. Because the index contains all fields required by the query, MongoDB can both match the query conditions and return the results using only the index. Querying only the index can be much faster than querying documents outside of the index.
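
For illustration, a covered query might look like the sketch below (pymongo; the collection and field names are hypothetical). The index contains every field the filter and projection touch, so MongoDB can answer from the index alone:

    from pymongo import MongoClient

    coll = MongoClient()["blog"]["posts"]  # assumed connection and collection

    # Compound index covering both the filter and the projected fields
    coll.create_index([("author", 1), ("created_at", -1)])

    # Covered query: only indexed fields are filtered and returned,
    # and _id is excluded so no document fetch is needed.
    cursor = coll.find(
        {"author": "alice"},
        {"author": 1, "created_at": 1, "_id": 0},
    )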

Can a single model be used to query more than one MongoDB collection?

You can query multiple collections using the $lookup aggregation. I wrote an example here on the forum. Also, the Query Results Data Source can absolutely join results from several MongoDB queries. It doesn't matter which source database you used.
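
As a sketch of that $lookup approach (pymongo, with hypothetical users and posts collections):

    from pymongo import MongoClient

    db = MongoClient()["blog"]  # assumed database

    # Left-outer join matching posts onto each user in a single aggregation
    pipeline = [
        {"$match": {"group": "some-group"}},
        {"$lookup": {
            "from": "posts",           # the other collection
            "localField": "_id",       # field on users
            "foreignField": "author",  # field on posts
            "as": "posts",             # array of matched posts per user
        }},
    ]
    users_with_posts = list(db.users.aggregate(pipeline))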

What are the limitations of MongoDB?

The maximum BSON document size is 16 megabytes. The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API.
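
For anything above that limit, GridFS usage looks roughly like this (pymongo; the file name is just an example):

    import gridfs
    from pymongo import MongoClient

    db = MongoClient()["media"]  # assumed database
    fs = gridfs.GridFS(db)

    # Store a file larger than 16 MB; GridFS splits it into chunks behind the scenes
    with open("large_video.mp4", "rb") as f:
        file_id = fs.put(f, filename="large_video.mp4")

    # Read it back as a single stream
    data = fs.get(file_id).read()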

Is MongoDB good for complex queries?

MongoDB is designed to make data easy to access, and rarely to require joins or transactions, but when you need to do complex querying, it's more than up to the task. The MongoDB Query API allows you to query deep into documents, and even perform complex analytics pipelines with just a few lines of declarative code.
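
As a small illustration of both (pymongo; the document shape is made up), a dot-notation query into embedded comments plus a short analytics pipeline:

    from pymongo import MongoClient

    coll = MongoClient()["blog"]["posts"]  # assumed collection

    # Query deep into embedded documents with dot notation
    flagged = coll.find({"comments.author": "alice", "comments.flagged": True})

    # A small analytics pipeline: post count per author, top 10
    top_authors = coll.aggregate([
        {"$group": {"_id": "$author", "post_count": {"$sum": 1}}},
        {"$sort": {"post_count": -1}},
        {"$limit": 10},
    ])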


1 Answer

To answer the question about $in...

I did some performance tests with the following scenario:

  • ~24 million documents in a collection
  • Look up 1 million of those documents based on an indexed key
  • Using the C# driver from .NET

Results:

  • Querying 1 at a time, single threaded: 109s
  • Querying 1 at a time, multi threaded: 48s
  • Querying 100K at a time using $in, single threaded: 20s
  • Querying 100K at a time using $in, multi threaded: 9s

So noticeably better performance using a large $in (restricted to max query size).

Update: Following on from comments below about how $in performs with different chunk sizes (queries multi-threaded):

  • Querying 10 at a time (100,000 batches): 8.8s
  • Querying 100 at a time (10,000 batches): 4.32s
  • Querying 1,000 at a time (1,000 batches): 4.31s
  • Querying 10,000 at a time (100 batches): 8.4s
  • Querying 100,000 at a time (10 batches): 9s (per original results above)

So there does look to be a sweet spot for how many values to batch up into an $in clause vs. the number of round trips.
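
As a rough illustration of that trade-off (not the original C# test code; a pymongo sketch with a hypothetical indexed key field):

    from pymongo import MongoClient

    coll = MongoClient()["testdb"]["docs"]  # assumed collection, indexed on "key"

    def fetch_by_keys(keys, batch_size=1000):
        """Yield matching documents, issuing one $in query per batch of keys."""
        for i in range(0, len(keys), batch_size):
            batch = keys[i:i + batch_size]
            yield from coll.find({"key": {"$in": batch}})

    # e.g. fetch 1 million documents in ~1000-key batches (the sweet spot above)
    # docs = list(fetch_by_keys(all_keys, batch_size=1000))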

Answered Oct 12 '22 by AdaTheDev