MongoDB text index search slow for common words in large table

Tags: performance, full-text-search, mongodb, lazy-evaluation

I am hosting a MongoDB database for a service that supports full-text search on a collection with 6.8 million records.

Its text index includes ten fields with varying weights.

index specification

Most searches take less than a second. Some take two to three seconds. However, some take 15-60 seconds! Those 15-60 second cases are unacceptable for my application, and I need to find a way to speed them up.

Searching takes 15-60 seconds when words that are very common in the index are used in the search query.

It seems that the text search feature does not support lazy parameters. My first thought was to cache a list of the 50 most common words in my text index and then ask MongoDB to evaluate those last (lazily), on top of the filtered results returned by the less common terms. Hopefully people are still with me. For example, say I have a query "products chocolate", where "products" is common and "chocolate" is uncommon. I would like to be able to ask MongoDB to evaluate "chocolate" first, and then filter those results with the "products" term. Does anyone know of a way to achieve this?

I can achieve the above scenario by omitting the most common words (e.g. "products") from the db query and then reapplying the common-term filter on the application side after it has received the records found by the db. I would prefer all query logic to happen on the database, but I am open to application-side processing for a speed payoff.
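A minimal sketch of that application-side split, in Python. The `COMMON_TERMS` set, the function names, and the `title` field are hypothetical and only for illustration; the real list would be the cached 50 most common index words. Note that this does plain whole-word matching, whereas MongoDB's text search also applies stemming, so the results may not match the database's exactly.

```python
import re

# Hypothetical cached set of very common index terms (an assumption for
# this sketch; in practice it would hold the ~50 most common words).
COMMON_TERMS = {"products"}

def split_terms(query, common=COMMON_TERMS):
    """Split a search string into (rare, frequent) lowercase term lists."""
    terms = query.lower().split()
    rare = [t for t in terms if t not in common]
    frequent = [t for t in terms if t in common]
    return rare, frequent

def filter_by_common(docs, frequent, fields):
    """Keep only documents whose given fields contain every frequent term.

    Uses exact whole-word matching (no stemming), unlike MongoDB itself.
    """
    kept = []
    for doc in docs:
        text = " ".join(str(doc.get(f, "")) for f in fields).lower()
        words = set(re.findall(r"\w+", text))
        if all(term in words for term in frequent):
            kept.append(doc)
    return kept
```

The idea would be to send only the `rare` terms to the database and then pass the returned documents through `filter_by_common`. If `rare` comes back empty, the query consists of common terms only and has to go to the database unchanged.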

There are still some holes in this design. If a user searches only common terms, I have no choice but to hit the database with all the terms. From preliminary reading, I gather that having multiple text indexes (with different names) on the same collection is not recommended (or not supported). My plan is to create two identical collections, each with my 6.8M records but with different indexes: one for common words and one for uncommon words. This feels kludgy and clunky, but I am willing to do it for a speed increase.

Does anyone have any insight or advice on how to speed up this system? I'd like as much processing as possible to happen on the database to keep it fast. I'm sure my little 6.8M-record collection is not the largest that MongoDB has seen. Thanks!


asked Jul 22 '13 16:07

kmehta

1 Answer

Well, I worked around these performance issues by letting MongoDB full-text search run in its OR-based mode. I'm prioritizing my results by fine-tuning the weights on my indexed fields and just ordering by rank. I do get more results than desired, but that's not a huge problem, because the highly weighted results at the top will most likely be consumed before my user gets to the less relevant results at the bottom.

If anyone is struggling with MongoDB text search performance using AND searching only, just switch back to OR and control your results using weights. It performs far better.
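For illustration, here is a rough Python sketch of the two query shapes, assuming MongoDB 2.6+ where the `$text` query operator is available. The helper names are hypothetical. Space-separated terms are ORed by `$text`, while wrapping each term in double quotes (one common way to force an AND) makes every term a required phrase.

```python
def or_search(terms):
    """Plain space-separated terms: matches documents containing ANY term."""
    return {"$text": {"$search": " ".join(terms)}}

def and_search(terms):
    """Each term quoted as a phrase: all phrases must be present."""
    return {"$text": {"$search": " ".join('"%s"' % t for t in terms)}}

# Projection that exposes the relevance score, which reflects index weights.
SCORE = {"score": {"$meta": "textScore"}}

# Usage with pymongo, assuming `collection` is an existing Collection
# (hypothetical name, not executed here):
#   cursor = (collection.find(or_search(["products", "chocolate"]), SCORE)
#             .sort([("score", {"$meta": "textScore"})]))
```

With the OR form, ranking by `textScore` is what pushes documents matching the rarer, more heavily weighted terms to the top.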

hth

answered Oct 20 '22 22:10

kmehta

