
MongoDB {aggregation $match} vs {find} speed

I have a MongoDB collection with millions of documents and I'm trying to optimize my queries. I'm currently using the aggregation framework to retrieve data and group it the way I want. My typical aggregation query looks something like: $match > $group > $group > $project

However, I noticed that the later stages only take a few ms; the first stage ($match) is the slowest.

I tried running an aggregation with only the $match stage, and then the same filter with collection.find. The aggregation query takes ~80ms while the find query takes 0 or 1ms.

I have indexes on pretty much every field, so I guess this isn't the problem. Any idea what could be going wrong? Or is it just a "normal" drawback of the aggregation framework?

I could use find queries instead of aggregation queries, but then I would have to do a lot of processing after the request. That processing can be done quickly with $group etc., so I would rather keep the aggregation framework.

Thanks,

EDIT :

Here are my criteria:

{
    "action" : "click",
    "timestamp" : {
        "$gt" : ISODate("2015-01-01T00:00:00Z"),
        "$lt" : ISODate("2015-02-01T00:00:00Z")
    },
    "itemId" : "5"
}
Owumaro asked Feb 06 '15 11:02



1 Answer

The main purpose of the aggregation framework is to make it easier to query a large number of documents and reduce them to a small number of results that are meaningful to you.

As you said, you could also use multiple find queries, but remember that you cannot create new fields with find queries. The $group stage, on the other hand, lets you define new fields.
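To illustrate that point, here is a minimal sketch of a pipeline in the shape the question describes ($match, then $group, then $project). The collection name `events` and the output fields `clicks` and `firstClick` are made up for the example; only the filter comes from the question, and it assumes the `2015-02-011` there is a typo for `2015-02-01`. `new Date(...)` stands in for the shell's `ISODate(...)` so the snippet also loads under plain Node.js.

```javascript
// Filter from the question; new Date() is used in place of ISODate().
const criteria = {
  action: "click",
  timestamp: {
    $gt: new Date("2015-01-01T00:00:00Z"),
    $lt: new Date("2015-02-01T00:00:00Z") // assumed fix of the question's date
  },
  itemId: "5"
};

const pipeline = [
  // Put $match first so the server can use an index for the filter.
  { $match: criteria },
  // $group creates fields that do not exist in the stored documents.
  // "clicks" and "firstClick" are hypothetical names.
  { $group: {
      _id: "$itemId",
      clicks: { $sum: 1 },
      firstClick: { $min: "$timestamp" }
  } },
  // Rename _id back to itemId in the output.
  { $project: { _id: 0, itemId: "$_id", clicks: 1, firstClick: 1 } }
];

// Only runs inside the mongo shell, where `db` is defined.
// "events" is a hypothetical collection name.
if (typeof db !== "undefined") {
  db.events.aggregate(pipeline).forEach(printjson);
}
```

A find query could return the same matching documents, but the counting and the minimum would then have to be computed on the client.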

To get the same functionality without the aggregation framework, you would most likely have to run an initial find (or chain several), pull the results to the client, and manipulate them further in a programming language.
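As a rough sketch of what that client-side step could look like, the snippet below counts documents per `itemId` in application code, which is roughly what `$group` with `{ $sum: 1 }` would do on the server. The helper name `countByItem` and the sample documents are invented for the example; in practice the array would come from a find query.

```javascript
// Hypothetical helper: count documents per itemId on the client.
function countByItem(docs) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.itemId, (counts.get(doc.itemId) || 0) + 1);
  }
  return counts;
}

// Stand-in for the documents a find() call would return (made-up data).
const docs = [
  { action: "click", itemId: "5" },
  { action: "click", itemId: "5" },
  { action: "click", itemId: "7" }
];

const counts = countByItem(docs);
console.log(counts.get("5"), counts.get("7")); // prints: 2 1
```

Every matching document has to travel to the client before this loop can run, which is the extra cost the aggregation pipeline avoids.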

The aggregation pipeline might seem to take longer, but at least you only have to account for the performance of one system: the MongoDB engine.

By contrast, data returned from a find query has to be post-processed in application code, which adds complexity that depends on the intricacies of the programming language you choose.
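If you want to check where the time actually goes rather than rely on wall-clock timings, one option is to ask the server for its plan for both forms of the query. This is only a sketch: `events` is a hypothetical collection name, `matchOnly` is a helper invented here, the filter is a shortened version of the one in the question, and the `explain("executionStats")` form assumes MongoDB 3.0 or later. The exact output shape differs between server versions.

```javascript
// Hypothetical helper: wrap a filter in a one-stage pipeline.
function matchOnly(filter) {
  return [{ $match: filter }];
}

// Shortened version of the question's filter, for illustration only.
const filter = { action: "click", itemId: "5" };

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  // Execution statistics for the plain query.
  printjson(db.events.find(filter).explain("executionStats"));
  // Plan the server picks for the $match stage of the pipeline.
  printjson(db.events.aggregate(matchOnly(filter), { explain: true }));
}
```

If both plans use the same index, the difference between the two timings probably lies somewhere other than the filter itself.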

vladzam answered Oct 06 '22 13:10