I have inserted the following documents into my events collection:
db.events.insert(
[
{ _id: 1, name: "Amusement Ride", description: "Fun" },
{ _id: 2, name: "Walk in Mangroves", description: "Adventure" },
{ _id: 3, name: "Walking in Cypress", description: "Adventure" },
{ _id: 4, name: "Trek at Tikona", description: "Adventure" },
{ _id: 5, name: "Trekking at Tikona", description: "Adventure" }
]
)
I've also created a text index as follows:
db.events.createIndex( { name: "text" } )
Now when I execute the following query (Search - Walk):
db.events.find({
'$text': {
'$search': 'Walk'
},
})
I get these results:
{ _id: 2, name: "Walk in Mangroves", description: "Adventure" },
{ _id: 3, name: "Walking in Cypress", description: "Adventure" }
But when I search for Trek:
db.events.find({
'$text': {
'$search': 'Trek'
},
})
I get only one result:
{ _id: 4, name: "Trek at Tikona", description: "Adventure" }
So my question is why it didn't return both of these:
{ _id: 4, name: "Trek at Tikona", description: "Adventure" },
{ _id: 5, name: "Trekking at Tikona", description: "Adventure" }
When I searched for Walk, it returned the documents containing both walk and walking. But when I searched for Trek, it returned only the document containing trek, when it should have returned both trek and trekking.
Setting up a full-text search engine in MongoDB Atlas takes only a few clicks. Go to any cluster and select the “Search” tab, then click “Create Search Index” to launch the process. Once the index is created, you can use the $search aggregation stage to perform full-text searches.
$search (string): A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless specified as a phrase.
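To make the logical OR behaviour concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB's implementation: the helpers tokenize and matchesAnyTerm are hypothetical, and the sketch deliberately skips stemming, stop words, phrases and negation, all of which the real text search applies.

```javascript
// Simplified illustration of OR matching over search terms.
// Not MongoDB's implementation: no stemming, stop words, phrases or negation.
const sampleEvents = [
  { _id: 1, name: "Amusement Ride" },
  { _id: 2, name: "Walk in Mangroves" },
  { _id: 3, name: "Walking in Cypress" },
  { _id: 4, name: "Trek at Tikona" },
  { _id: 5, name: "Trekking at Tikona" },
];

// Hypothetical helper: split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// A document matches when at least one search term appears among its tokens.
function matchesAnyTerm(doc, search) {
  const docTokens = new Set(tokenize(doc.name));
  return tokenize(search).some((term) => docTokens.has(term));
}

const matchedIds = sampleEvents
  .filter((doc) => matchesAnyTerm(doc, "Amusement Tikona"))
  .map((doc) => doc._id);
console.log(matchedIds); // [ 1, 4, 5 ]
```

Documents 4 and 5 match on Tikona alone and document 1 on Amusement alone, which is the OR behaviour described above.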
For a text index, the weight of an indexed field denotes the significance of the field relative to the other indexed fields in terms of the text search score. For each indexed field in the document, MongoDB multiplies the number of matches by the weight and sums the results.
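As a rough sketch of the weighting rule just described, the function below multiplies each field's match count by its weight and sums the results. The function name, the example weights and the match counts are all hypothetical, and MongoDB derives its final score from this sum in a further step that the sketch does not attempt to reproduce.

```javascript
// Simplified reading of the weighting rule: sum over fields of (matches * weight).
// MongoDB computes its final score from this sum; that step is not modelled here.
function weightedMatchSum(matchesPerField, weights) {
  let sum = 0;
  for (const [field, matches] of Object.entries(matchesPerField)) {
    const weight = weights[field] ?? 1; // an indexed field without a weight defaults to 1
    sum += matches * weight;
  }
  return sum;
}

// Hypothetical weights and match counts, for illustration only.
const exampleWeights = { name: 10, description: 2 };
const exampleMatches = { name: 1, description: 3 };
console.log(weightedMatchSum(exampleMatches, exampleWeights)); // 16
```

In the shell, weights like these would be passed in the index options, for example db.events.createIndex({ name: "text", description: "text" }, { weights: { name: 10, description: 2 } }).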
MongoDB offers a full-text search solution, MongoDB Atlas Search, for data hosted on MongoDB Atlas.
MongoDB text search uses the Snowball stemming library to reduce words to an expected root form (or stem) based on common language rules. Algorithmic stemming provides a quick reduction, but languages have exceptions (such as irregular or contradicting verb conjugation patterns) that can affect accuracy. The Snowball introduction includes a good overview of some of the limitations of algorithmic stemming.
Your example of walking stems to walk and matches as expected.
However, your example of trekking stems to trekk, so it does not match your search keyword of trek.
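The difference between the two words can be illustrated with a sketch of a single rule, as I read the published Snowball English algorithm: after removing an "ing" suffix, a trailing double letter is shortened only if it is one of bb, dd, ff, gg, mm, nn, pp, rr or tt. Since kk is not in that list, trekk is left as it is. The function stripIng is hypothetical and covers only this one rule, not the full stemmer.

```javascript
// Sketch of ONE rule from the Snowball English stemmer, not the full algorithm.
// Other steps (other suffixes, restoring a final "e", short-word handling) are omitted.
const VOWELS = /[aeiouy]/;
const UNDOUBLED_ENDINGS = ["bb", "dd", "ff", "gg", "mm", "nn", "pp", "rr", "tt"];

function stripIng(word) {
  const lower = word.toLowerCase();
  if (!lower.endsWith("ing")) return lower;
  const stem = lower.slice(0, -3);
  // The suffix is only removed when the remaining part contains a vowel.
  if (!VOWELS.test(stem)) return lower;
  // Only the listed double letters are reduced to a single letter.
  if (UNDOUBLED_ENDINGS.includes(stem.slice(-2))) return stem.slice(0, -1);
  return stem;
}

console.log(stripIng("Walking"));  // walk
console.log(stripIng("Trekking")); // trekk
console.log(stripIng("Running"));  // run
```

For real use, check words against an actual Snowball library rather than this simplification.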
You can confirm this by explaining your query and reviewing the parsedTextQuery information, which shows the stemmed search terms used:
db.events.find({$text: {$search: 'Trekking'} }).explain().queryPlanner.winningPlan.parsedTextQuery
{
"terms" : [
"trekk"
],
"negatedTerms" : [ ],
"phrases" : [ ],
"negatedPhrases" : [ ]
}
You can also check expected Snowball stemming using the online Snowball Demo or by finding a Snowball library for your preferred programming language.
To work around exceptions that might commonly affect your use case, you could consider adding another field to your text index with keywords to influence the search results. For this example, you would add trek as a keyword so that the event described as trekking also matches in your search results.
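A minimal sketch of that workaround might look like the following. The field name keywords is a hypothetical choice, and the sketch assumes the existing index still has its default name, name_text. The shell commands are guarded so that they only run where a db handle exists, such as in mongosh.

```javascript
// Assumptions: the extra field is called "keywords" (hypothetical name) and the
// existing text index has the default name "name_text".
const keywordPatch = {
  filter: { _id: 5 },
  update: { $set: { keywords: ["trek"] } },
};
const textIndexSpec = { name: "text", keywords: "text" };

// The shell commands only run where a `db` handle exists (for example in mongosh).
if (typeof db !== "undefined") {
  // Add the manual keyword to the "Trekking at Tikona" event.
  db.events.updateOne(keywordPatch.filter, keywordPatch.update);
  // A collection can have only one text index, so the old one is replaced.
  db.events.dropIndex("name_text");
  db.events.createIndex(textIndexSpec);
}
```

With both fields in the index, a search for Trek should match document 5 through its keywords field even though its name stems to trekk.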
There are other approaches for more accurate inflection which are generally referred to as lemmatization. Lemmatization algorithms are more complex and start heading into the domain of natural language processing. There are many open source (and commercial) toolkits that you may be able to leverage if you want to implement more advanced text search in your application, but these are outside the current scope of the MongoDB text search feature.