
MongoDB full word search with exact phrase not returning expected results

According to the MongoDB docs:

if a document field contains the word blueberry, a search on the term blue will not match the document

This is exactly the behavior I want for my use case. However, given the following DB entries:

> db.test.drop()
> db.test.insert({ "t" : "Men's Fashion" })
> db.test.insert({ "t" : "Women's Fashion" })
> db.test.ensureIndex({ "t" : "text" })

A search for Men's returns the expected results:

> db.test.find({ "$text" : { "$search" : "\"Men's\"" } }, { "_id" : 0 })
{ "t" : "Men's Fashion" }

However, a search for the whole phrase Men's Fashion unexpectedly also returns Women's Fashion:

> db.test.find({ "$text" : { "$search" : "\"Men's Fashion\"" } }, { "_id" : 0 })
{ "t" : "Women's Fashion" }
{ "t" : "Men's Fashion" }

I've tried "\"Men's\"\"Fashion\"" as well with the same results. Is there a workaround/trick to get the full phrase to only return whole word matches?

I'm using Mongo 2.6.4. Interestingly, it does score Women's lower than Men's.

JBY asked Jul 27 '15 17:07


1 Answer

You see these results because the phrase is matched as a case-insensitive substring: woMEN'S FASHION contains MEN'S FASHION, so the search string is found inside the string being searched.
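The substring effect can be reproduced outside the database. The following is a minimal sketch in plain JavaScript, assuming the phrase check behaves like a case-insensitive substring test; `containsPhrase` is a hypothetical helper written for illustration, not MongoDB's actual implementation.

```javascript
// Hypothetical helper: mimics a case-insensitive substring check.
// This is an illustration only, not MongoDB's real phrase-matching code.
function containsPhrase(fieldValue, phrase) {
  return fieldValue.toLowerCase().includes(phrase.toLowerCase());
}

// The two values from the question's test collection.
const fashionDocs = ["Men's Fashion", "Women's Fashion"];

// Both values pass, because "women's fashion" contains "men's fashion".
const phraseMatches = fashionDocs.filter((t) => containsPhrase(t, "Men's Fashion"));
console.log(phraseMatches);
```

Under that assumption, both values pass the check, which is consistent with the two rows returned in the question.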

The same behavior does not occur with this dataset, because none of the other entries contains the search phrase as a substring:

/* 1 */
{
    "_id" : ObjectId("55ca6060fb286267994d297e"),
    "text" : "potato pancake"
}

/* 2 */
{
    "_id" : ObjectId("55ca6075fb286267994d297f"),
    "text" : "potato salad"
}

/* 3 */
{
    "_id" : ObjectId("55ca612ffb286267994d2980"),
    "text" : "potato's pancake"
}

with the query

db.getCollection('rudy_test').find({$text : {"$search" : "\"potato pancake\""}})

The Women's Fashion entry is returned because it does contain the entire query; its score is just lower because it contains other text as well. You could use a regular expression query instead, e.g. db.test.find({t : {$regex : /^Men's Fashion$/}}). Note that this anchored pattern matches only documents whose field is exactly Men's Fashion.
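To compare the anchored pattern with a looser whole-word variant, here is a small sketch using plain JavaScript regexes as a stand-in for a `$regex` filter. The `\b` word-boundary pattern and the third sample title are assumptions added for illustration; they do not appear in the original question or answer.

```javascript
// Sample values: the first two come from the question, the third is hypothetical.
const sampleTitles = ["Men's Fashion", "Women's Fashion", "Shop Men's Fashion today"];

// Anchored pattern from the answer: the whole field must equal the phrase.
const exactField = /^Men's Fashion$/;

// Assumed alternative: word boundaries allow surrounding text,
// but reject "Women's" because there is no boundary between "o" and "m".
const wholeWords = /\bMen's Fashion\b/i;

const exactMatches = sampleTitles.filter((t) => exactField.test(t));
const wholeWordMatches = sampleTitles.filter((t) => wholeWords.test(t));

console.log(exactMatches);
console.log(wholeWordMatches);
```

In the mongo shell, the word-boundary variant could presumably be passed the same way as the anchored one, e.g. db.test.find({t : {$regex : /\bMen's Fashion\b/i}}). Keep in mind that a regex filter like this does not use the text index, so it may be slow on large collections.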

Jason Nichols answered Sep 28 '22 02:09