How does MongoDB $text search work?

I have inserted the following documents into my events collection:

db.events.insert(
   [
     { _id: 1, name: "Amusement Ride", description: "Fun" },
     { _id: 2, name: "Walk in Mangroves", description: "Adventure" },
     { _id: 3, name: "Walking in Cypress", description: "Adventure" },
     { _id: 4, name: "Trek at Tikona", description: "Adventure" },
     { _id: 5, name: "Trekking at Tikona", description: "Adventure" }
   ]
)

I've also created a text index on the name field in the following way:

db.events.createIndex( { name: "text" } )

Now when I execute the following query (Search - Walk):

db.events.find({
    '$text': {
        '$search': 'Walk'
    },
})

I get these results:

{ _id: 2, name: "Walk in Mangroves", description: "Adventure" },
{ _id: 3, name: "Walking in Cypress", description: "Adventure" }

But when I search Trek:

db.events.find({
    '$text': {
        '$search': 'Trek'
    },
})

I get only one result:

{ _id: 4, name: "Trek at Tikona", description: "Adventure" }

So my question is why it didn't return both of these documents:

{ _id: 4, name: "Trek at Tikona", description: "Adventure" },
{ _id: 5, name: "Trekking at Tikona", description: "Adventure" }

When I searched for Walk, the results included the documents containing both walk and walking. But when I searched for Trek, the results included only the document containing trek, where I expected both trek and trekking.

asked Dec 06 '18 13:12

Sushant K

1 Answer

MongoDB text search uses the Snowball stemming library to reduce words to an expected root form (or stem) based on common language rules. Algorithmic stemming provides a quick reduction, but languages have exceptions (such as irregular or contradicting verb conjugation patterns) that can affect accuracy. The Snowball introduction includes a good overview of some of the limitations of algorithmic stemming.

Your example of walking stems to walk and matches as expected.

However, your example of trekking stems to trekk so does not match your search keyword of trek.

You can confirm this by explaining your query and reviewing the parsedTextQuery information which shows the stemmed search terms used:

db.events.find({$text: {$search: 'Trekking'} }).explain().queryPlanner.winningPlan.parsedTextQuery
{
   "terms" : [
       "trekk"
   ],
   "negatedTerms" : [ ],
   "phrases" : [ ],
   "negatedPhrases" : [ ]
}

You can also check expected Snowball stemming using the online Snowball Demo or by finding a Snowball library for your preferred programming language.
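As a rough illustration of why the two words diverge, here is a heavily simplified sketch of a single suffix rule. It is not the Snowball English stemmer: the function name toyStemIng and the list of endings are assumptions made for illustration, and the real algorithm has many more steps (for example, restoring a trailing e in words like hoping, which this sketch omits). The idea is that after stripping -ing, a doubled final consonant is shortened only for certain letter pairs, and kk is assumed not to be one of them.

```javascript
// Simplified illustration only: NOT the actual Snowball English stemmer.
// Assumption: these are the doubled endings that get shortened after "-ing" is removed.
const UNDOUBLED_ENDINGS = ["bb", "dd", "ff", "gg", "mm", "nn", "pp", "rr", "tt"];

function toyStemIng(word) {
  const lower = word.toLowerCase();
  if (!lower.endsWith("ing")) {
    return lower;
  }
  let stem = lower.slice(0, -3);
  // Only strip "-ing" when the remaining part contains a vowel ("sing" stays "sing").
  if (!/[aeiouy]/.test(stem)) {
    return lower;
  }
  // Shorten a doubled final consonant only if it is in the assumed list.
  if (UNDOUBLED_ENDINGS.includes(stem.slice(-2))) {
    stem = stem.slice(0, -1);
  }
  return stem;
}

console.log(toyStemIng("walking"));  // no doubled ending to shorten
console.log(toyStemIng("trekking")); // "kk" is not in the list, so both k's remain
console.log(toyStemIng("running"));  // "nn" is in the list, so one n is dropped
```

Under these assumptions, walking reduces to the same term as walk, while trekking keeps its doubled k and so no longer equals trek.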

To work around exceptions that might commonly affect your use case, you could consider adding another field to your text index with keywords to influence the search results. For this example, you would add trek as a keyword so that the event described as trekking also matches in your search results.
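As a sketch of this workaround, assuming a hypothetical keywords field (the field name and the keywordUpdate helper are illustrative and not part of the original answer), the update and the combined index specification could be built like this. A collection can have only one text index, so the existing index on name would need to be dropped before creating the combined one.

```javascript
// Hypothetical helper: builds the update that adds search keywords to one event.
function keywordUpdate(id, keywords) {
  return {
    filter: { _id: id },
    update: { $set: { keywords: keywords.join(" ") } }
  };
}

// Text index covering the original name field plus the assumed keywords field.
const textIndexSpec = { name: "text", keywords: "text" };

// Add "trek" as a keyword for the "Trekking at Tikona" event (_id: 5).
const op = keywordUpdate(5, ["trek"]);
console.log(JSON.stringify(op));

// In the mongo shell these could be applied roughly as follows:
//   db.events.updateOne(op.filter, op.update)
//   db.events.dropIndex("name_text")   // default name of the index on { name: "text" }
//   db.events.createIndex(textIndexSpec)
```

With the keyword in place, a search for Trek should also match the Trekking at Tikona event, because trek is now an indexed term for that document.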

There are other approaches for more accurate inflection which are generally referred to as lemmatization. Lemmatization algorithms are more complex and start heading into the domain of natural language processing. There are many open source (and commercial) toolkits that you may be able to leverage if you want to implement more advanced text search in your application, but these are outside the current scope of the MongoDB text search feature.

answered Sep 20 '22 23:09

Stennie

Tags: node.js, mongodb, mongodb-query, mongoose
