MongoDB text index search slow for common words in large table

I am hosting a mongodb database for a service that supports full text searching on a collection with 6.8 million records.

Its text index includes ten fields with varying weights.

index specification
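
(The actual index specification isn't reproduced here. Purely for illustration, a weighted multi-field text index is created along the lines of the sketch below - Python/pymongo, with placeholder database, collection, and field names rather than the real ten fields and weights.)

    from pymongo import MongoClient

    coll = MongoClient().mydb.records  # placeholder database/collection names

    # Weighted text index spanning several fields; the real index covers ten
    # fields with its own weights.
    coll.create_index(
        [("title", "text"), ("description", "text"), ("tags", "text")],
        weights={"title": 10, "description": 5, "tags": 2},
        name="record_text_index",
        default_language="english",
    )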

Most searches take less than a second, and some take two to three seconds. However, some searches take 15-60 seconds! Those cases are unacceptable for my application, and I need to find a way to speed them up.

Searching takes 15-60 seconds when words that are very common in the index are used in the search query.

It seems that the text search feature does not support lazy evaluation of parameters. My first thought was to cache a list of the 50 most common words in my text index and then ask MongoDB to evaluate those last (lazily), on top of the filtered results returned by the less common parameters. Hopefully people are still with me. For example, say I have the query "products chocolate", where "products" is common and "chocolate" is uncommon. I would like to be able to ask MongoDB to evaluate "chocolate" first and then filter those results with the "products" term. Does anyone know of a way to achieve this?
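
The closest DB-side approximation I can think of is to put only the uncommon word in $text and express the common word as an ordinary field filter, which MongoDB then applies to the (much smaller) text result set. A rough pymongo sketch, assuming a field named "title"; the regex filter is only an approximation, since it checks a single field and does no stemming:

    from pymongo import MongoClient

    coll = MongoClient().mydb.records  # placeholder names, as above

    cursor = coll.find({
        "$text": {"$search": "chocolate"},                        # uncommon term via the text index
        "title": {"$regex": r"\bproducts?\b", "$options": "i"},   # common term as a plain field filter
    })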

I can achieve the above scenario by omitting the most common words (i.e. "products") from the DB query and then reapplying the common-term filter on the application side after it has received the records found by the DB. It is preferable for all query logic to happen on the database, but I am open to application-side processing for a speed payoff.
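
A sketch of that application-side workaround in pymongo (the common-word list and the fields it re-checks are placeholders):

    from pymongo import MongoClient

    coll = MongoClient().mydb.records            # placeholder names
    COMMON_WORDS = {"products", "item", "new"}   # cached list of frequent index terms (placeholder)

    def search(query: str):
        words = query.lower().split()
        rare = [w for w in words if w not in COMMON_WORDS]
        common = [w for w in words if w in COMMON_WORDS]

        # If the user typed only common terms, there is no choice but to send
        # them all to the database.
        if not rare:
            rare = words

        # Hit the text index with the uncommon words only.
        docs = list(coll.find({"$text": {"$search": " ".join(rare)}}))

        # Re-apply the common terms on the application side.
        def matches(doc):
            text = " ".join(str(doc.get(f, "")) for f in ("title", "description", "tags")).lower()
            return all(w in text for w in common)

        return [d for d in docs if matches(d)]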

There are still some holes in this design. If a user searches only common terms, I have no choice but to hit the database with all the terms. From preliminary reading, I gather that it is not recommended (or not supported) to have multiple text indexes (with different names) on the same collection. My plan is to create two identical tables, each with my 6.8M records, with different indexes - one for common words and one for uncommon words. This feels kludgy and clunky, but I am willing to do it for a speed increase.
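
A sketch of that two-collection kludge in pymongo. Because MongoDB allows only one text index per collection, each mirror gets its own; and since a text index can't be restricted to a word list directly, each index here points at a hypothetical pre-computed field ("common_terms" / "rare_terms") that the application would have to maintain on write:

    from pymongo import MongoClient

    db = MongoClient().mydb  # placeholder database name

    # Mirror the 6.8M-record collection into two copies.
    db.records.aggregate([{"$match": {}}, {"$out": "records_common"}])
    db.records.aggregate([{"$match": {}}, {"$out": "records_rare"}])

    # One text index per copy, each over a different pre-computed field.
    db.records_common.create_index([("common_terms", "text")], name="common_text")
    db.records_rare.create_index([("rare_terms", "text")], name="rare_text")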

Does anyone have any insight and/or advice on how to speed up this system? I'd like as much processing as possible to happen on the database to keep it fast. I'm sure my little 6.8M record table is not the largest MongoDB has seen. Thanks!

asked Jul 22 '13 by kmehta

People also ask

Do indexes slow down inserts in MongoDB?

If every "the" and every "and" is indexed, this slows down your MongoDB instance, because each time data is inserted the indexes have to be updated. Also, while it's better than having no index at all, it can take longer for MongoDB to find what you're looking for, and then everything slows down.

Can MongoDB do full-text search?

MongoDB offers a full-text search solution, MongoDB Atlas Search, for data hosted on MongoDB Atlas.

How can indexes speed up queries in MongoDB?

Performance. Because the index contains all fields required by the query, MongoDB can both match the query conditions and return the results using only the index. Querying only the index can be much faster than querying documents outside of the index.
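
For example, a covered query might look like the following pymongo sketch (field names are illustrative): the filter and the projection touch only indexed fields and _id is excluded, so MongoDB can answer it from the index alone.

    from pymongo import MongoClient

    coll = MongoClient().mydb.records  # placeholder names
    coll.create_index([("category", 1), ("name", 1)])

    # Covered query: filter and projection use only indexed fields, _id excluded,
    # so the documents themselves never have to be fetched.
    cursor = coll.find(
        {"category": "books"},
        {"_id": 0, "category": 1, "name": 1},
    )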

Are MongoDB indexes strongly consistent?

Yes - MongoDB indexes are strongly consistent with the underlying data.


1 Answer

Well, I worked around these performance issues by letting the MongoDB full text search run in an OR-based format. I'm prioritizing my results by fine-tuning the weights on my indexed fields and simply ordering by rank. I do get more results than desired, but that's not a huge problem, because the weighted results that appear at the top will most likely be consumed before my users get to the less relevant results at the bottom.

If anyone is struggling with MongoDB text search performance using AND-only searching, just switch back to OR and control your results using weights. It performs leaps and bounds better.
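
Concretely, something along these lines (a pymongo sketch with placeholder names; terms in a $text search are ORed by default, and the relevance score already incorporates the per-field weights defined on the index):

    from pymongo import MongoClient

    coll = MongoClient().mydb.records  # placeholder names

    # Plain OR-style text search, ordered by relevance score.
    cursor = (
        coll.find(
            {"$text": {"$search": "products chocolate"}},
            {"score": {"$meta": "textScore"}},
        )
        .sort([("score", {"$meta": "textScore"})])
        .limit(20)
    )

    for doc in cursor:
        print(doc["score"], doc.get("title"))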

hth

answered Oct 20 '22 by kmehta