Let's say I have a mongo collection with a text index on the itemName field with these 3 documents:
{
  _id: ...,
  itemName: 'Mashed carrots with big carrot pieces',
  price: 1.29
},
{
  _id: ...,
  itemName: 'Carrot juice',
  price: 0.79
},
{
  _id: ...,
  itemName: 'Apple juice',
  price: 1.49
}
I then execute a query like so:
db.items.find({ $text: { $search: 'Car' } }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
How do I force mongo to return documents beginning with "Car" (case insensitive) before returning any other docs that also contain "Car" somewhere in the itemName string?
So I want to retrieve the docs in the following order:
[
{..., itemName: 'Carrot juice', ...},
{..., itemName: 'Mashed carrots with big carrot pieces', ...}
]
Of course this is meant for a search feature, so it makes sense to show the user the items starting with their search string before showing any other items.
Until now I was using a standard regex, but the performance is of course much worse. And since I have to search case insensitively, according to the docs a normal regex does not use any indexes at all?!
EDIT:
Also, sometimes the behavior of $text is very weird.
For example, I have about 10-15 items where itemName begins with the word "Zwiebel".
This query
db.items.find({ $text: { $search: "Zwiebel" }, supplier_id: 'iNTJHEf5YgBPicTrJ' }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
works like a charm and returns all those documents, while this query
db.items.find({ $text: { $search: "Zwie" }, supplier_id: 'iNTJHEf5YgBPicTrJ' }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
does not return anything! The only change is "Zwiebel" to "Zwie" in the $search string.
I really do not understand how this is possible?!
best, P
A solution is to use the $indexOfCP operator introduced in MongoDB 3.4.
This operator returns the index of the first occurrence of a string in another string, and -1 if there is no occurrence.
How it works:
1. Use a case-insensitive regex to retrieve all documents whose itemName contains "car".
2. Create an index field which stores the index of 'car' in itemName.
3. Sort the documents on the index field.
The query would look like this:
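db.items.aggregate([
  // 1. keep only documents whose itemName contains "car" (case insensitive);
  //    "g" is not a MongoDB regex option, so only "i" is used
  { $match: { itemName: /car/i } },
  // 2. store the position of "car" in the lowercased itemName
  {
    $project: {
      index: { $indexOfCP: [ { $toLower: "$itemName" }, "car" ] },
      price: 1,
      itemName: 1
    }
  },
  // 3. documents starting with "car" (index 0) come first
  { $sort: { index: 1 } }
])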
and this returns:
{ "_id" : 2, "itemName" : "Carrot juice", "price" : 0.79, "index" : 0 }
{ "_id" : 1, "itemName" : "Mashed carrots with big carrot pieces", "price" : 1.29, "index" : 7 }
try it online: mongoplayground.net/p/FqqCUQI3D-E
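To reuse the pipeline above with whatever the user typed instead of a hard-coded "car", something along these lines might work. This is only a sketch: escapeRegex and buildPrefixFirstPipeline are hypothetical helper names, $addFields is used in place of $project so the other fields are kept, the tie-break on itemName is my own addition, and it assumes ASCII search terms (where JavaScript's toLowerCase and $toLower agree).

```javascript
// Hypothetical helper: escape regex metacharacters so user input
// such as "1.5l" is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the aggregation pipeline for a search term.
function buildPrefixFirstPipeline(term) {
  const needle = term.toLowerCase();
  return [
    // keep documents whose itemName contains the term, case insensitive
    { $match: { itemName: { $regex: escapeRegex(term), $options: "i" } } },
    // position of the term in the lowercased itemName; $literal stops a
    // term starting with "$" from being read as a field path
    {
      $addFields: {
        index: { $indexOfCP: [{ $toLower: "$itemName" }, { $literal: needle }] },
      },
    },
    // smallest position first; itemName as an (assumed) tie-breaker
    { $sort: { index: 1, itemName: 1 } },
  ];
}
```

It would then be called as db.items.aggregate(buildPrefixFirstPipeline("Car")). Note that the $match stage is still an unanchored, case-insensitive regex, so it has the same index limitations mentioned in the question.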
Edit:
As for the behavior of the $text index, this is completely normal.
A text index tokenizes text using delimiters (the default delimiters are whitespace and punctuation). It can only be used to search for whole words, so it won't work for parts of words.
From the MongoDB text index documentation:
$text will tokenize the search string using whitespace and most punctuation as delimiters, and perform a logical OR of all such tokens in the search string.
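A rough way to picture why "Zwie" finds nothing while "Zwiebel" does is the plain JavaScript model below. It is a simplification, not MongoDB's actual implementation: it only splits on whitespace and punctuation and compares whole lowercased tokens, ignoring the stemming and stop words that real text indexes apply. The function names and the sample item name are made up for illustration.

```javascript
// Simplified model of $text tokenization: lowercase, then split on
// whitespace and punctuation. Stemming and stop words are ignored.
function tokenize(text) {
  return text.toLowerCase().split(/[\s\p{P}]+/u).filter(Boolean);
}

// A document matches if ANY search token equals one of its tokens
// (the logical OR described in the documentation quote).
function matchesWholeToken(itemName, search) {
  const indexed = new Set(tokenize(itemName));
  return tokenize(search).some((token) => indexed.has(token));
}

const name = "Zwiebel rot 1kg"; // hypothetical item name
console.log(matchesWholeToken(name, "Zwiebel")); // true
console.log(matchesWholeToken(name, "Zwie"));    // false
```

Since "zwie" is not one of the stored tokens, there is nothing for the index to look up, which is why a substring search has to fall back on a regex or on the $indexOfCP approach.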