Here is my MongoDB shell session;
> db.foo.save({path: 'a:b'})
WriteResult({ "nInserted" : 1 })
> db.foo.findOne()
{ "_id" : ObjectId("58fedc47622e89329d123ee8"), "path" : "a:b" }
> db.foo.save({path: 'a:b:c'})
WriteResult({ "nInserted" : 1 })
> db.foo.find({path: /a:[^:]+/})
{ "_id" : ObjectId("58fedc47622e89329d123ee8"), "path" : "a:b" }
{ "_id" : ObjectId("58fedc57622e89329d123ee9"), "path" : "a:b:c" }
> db.foo.find({path: /a:[a-z]+/})
{ "_id" : ObjectId("58fedc47622e89329d123ee8"), "path" : "a:b" }
{ "_id" : ObjectId("58fedc57622e89329d123ee9"), "path" : "a:b:c" }
Clearly the regex /a:[^:]+/
and /a:[a-z]+/
shouldn't match string 'a:b:c'
, but looks like Mongo failed on this regex, does anyone know what happened here?
It was submitted to MongoDB Jira, as a bug ticket, so is it a bug within MongoDB querying structure?
MongoDB provides the functionality to search a pattern in a string during a query by writing a regular expression. A regular expression is a generalized way to match patterns with sequences of characters. MongoDB uses Perl compatible regular expressions(PCRE) version 8.42 along with UTF-8 support.
MongoDB also provides functionality of regular expression for string pattern matching using the $regex operator. MongoDB uses PCRE (Perl Compatible Regular Expression) as regular expression language. Unlike text search, we do not need to do any configuration or command to use regular expressions.
Create a Wildcard Index on All Fields With this wildcard index, MongoDB indexes all fields for each document in the collection. If a given field is a nested document or array, the wildcard index recurses into the document/array and stores the value for all fields in the document/array.
The trouble is with the partial matching, since you are not restricting the regex for the whole word, the partial match that exists in a:b:c
that is a:b
is resulting in you getting that document.
Use the following regex with ^$
that are anchors to represent beginning and the end of the word;
db.foo.find({path: /^a:[^:]+$/})
db.foo.find({path: /^a:[a-z]+$/})
This will make the regex apply for the whole string, and ignore the partial matches as explained above. For more on regex anchors, click here.
So, in summary, there is no bug, just a misuse of regex.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With