I am hosting a mongodb database for a service that supports full text searching on a collection with 6.8 million records.
Its text index includes ten fields with varying weights.
Most searches take less than a second. Some searches take two to three seconds. However, some searches take 15 - 60 seconds! The 15-60 second search cases are unacceptable for my application. I need to find a way to speed those up.
Searching takes 15-60 seconds when words that are very common in the index are used in the search query.
I seems that the text search feature does not support lazy parameters. My first thought was to cache a list of the 50 most common words in my text index and then ask mongodb to evaluate those last (lazy) and on top of the filtered results returned by the less common parameters. Hopefully people are still with me. For example, say I have a query "products chocolate", where products is common and chocolate is uncommon. I would like to be able to ask mongodb to evaluate "chocolate" first, and then filter those results with the "products" term. Does anyone know of a way to achieve this?
I can achieve the above scenario by omitting the most common words (i.e. "products") from the db query and then reapplying the common term filter on the application side after it has received records found by db. It is preferable for all query logic to happen on the database, but am open to application side processing for a speed payout.
There are still some holes in this design. If a user only searches common terms, I have no choice but to hit the database with all the terms. From preliminary reading, I gather that it is not recommended (or not supported) to have multiple text indexes (with different names) on the same collection. My plan is to create two identical tables, each with my 6.8M records, with different indexes - one for common words and one for uncommon words. This feels kludgy and clunky, but am willing to do this for a speed increase.
Does anyone have any insight and/or advice on how to speed up this system. I'd like as much processing to happen on the database as possible to keep it fast. I'm sure my little 6.8M record table is not the largest that mongodb has seen. Thanks!
If every “the” and every “and” is indexed, this slows down your MongoDB instances because each time data is inserted into results, the indexes are updated. Also, while it's better than having no index at all, it can take longer for MongoDB to find what you're looking for and then everything slows down.
MongoDB offers a full-text search solution, MongoDB Atlas Search, for data hosted on MongoDB Atlas.
Performance. Because the index contains all fields required by the query, MongoDB can both match the query conditions and return the results using only the index. Querying only the index can be much faster than querying documents outside of the index.
Indexes are strongly consistent with the underlying data. DynamoDB supports key-value queries only.
Well I worked around these performance issues by allowing MongoDB full text search to search in OR based format. I'm prioritizing my results by fine tuning the weights on my indexed fields and just ordering by rank. I do get more results than desired, but that's not a huge problem because my weighted results that appear at the top will most likely be consumed before my user gets to less relevant results at the bottom.
If anyone is struggling with MongoDB text search performance using AND searching only, just switch back to OR and control your results using weights. It performs leaps better.
hth
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With