
Efficiently sorting the results of a mongodb geospatial query

Tags:

mongodb

I have a very large collection of documents like:

{ loc: [10.32, 24.34], relevance: 0.434 }

and want to be able to efficiently run a query like:

 { "loc": {"$geoWithin":{"$box":[[-103,10.1],[-80.43,30.232]]}} }

with arbitrary boxes.

Adding a 2d index on loc makes this very fast and efficient. However, I now also want to get only the most relevant documents:

.sort({ relevance: -1 })

This causes everything to slow to a crawl (there can be a huge number of results in any particular box, and I just need the top 10 or so).

Any advice or help would be greatly appreciated!

Heptic asked Dec 21 '22 00:12


1 Answer

Have you tried using the aggregation framework?

A two stage pipeline might work:

  1. a $match stage that uses your existing $geoWithin query.
  2. a $sort stage that sorts by relevance: -1

Here's an example of what it might look like:

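db.foo.aggregate([
    // Stage 1: keep only the documents inside the bounding box
    {$match: { "loc": {"$geoWithin":{"$box":[[-103,10.1],[-80.43,30.232]]}} }},
    // Stage 2: order the matches by relevance, highest first
    {$sort: {relevance: -1}}
]);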

I'm not sure how it will perform. However, even if it's poor with MongoDB 2.4, it might be dramatically different in 2.6/2.5, as 2.6 will include improved aggregation sort performance.
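Since the question only needs the top 10 or so, it may also be worth appending a $limit stage directly after the $sort. As far as I know, the pipeline optimizer can combine a $sort that is immediately followed by a $limit so that only the top n documents are kept in memory while sorting, but I have not measured this on a collection like yours. Below is a minimal sketch; the helper name topRelevantInBox and its parameters are made up for illustration, and it only builds the pipeline array so you can inspect it before running it.

```javascript
// Hypothetical helper (not part of MongoDB): builds a three-stage pipeline
// that filters by bounding box, sorts by relevance, and keeps the top `limit`.
function topRelevantInBox(box, limit) {
    return [
        // Stage 1: the same $geoWithin/$box filter used in the question
        { $match: { loc: { $geoWithin: { $box: box } } } },
        // Stage 2: highest relevance first
        { $sort: { relevance: -1 } },
        // Stage 3: placed right after $sort so only the top `limit` are returned
        { $limit: limit }
    ];
}

// Box corners taken from the question; 10 is the "top 10 or so" mentioned there.
const pipeline = topRelevantInBox([[-103, 10.1], [-80.43, 30.232]], 10);

// In the mongo shell, assuming the collection is called foo as above:
//   db.foo.aggregate(pipeline)
```

Keeping the $limit immediately after the $sort matters here: if another stage sits between them, the two may no longer be combined.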

Sean Reilly answered May 06 '23 07:05