This question is closely related to this one and I will consider the advice given with respect to schema design in a NoSQL context, yet I'm curious to understand this:
Suppose you have the following document:
_id : 2 abcd
name : 2 unittest.com
paths : 4
0 : 3
path : 2 home
queries : 4
0 : 3
name : 2 query1
url : 2 www.unittest.com/home?query1
requests: 4
1 : 3
name : 2 query2
url : 2 www.unittest.com/home?query2
requests: 4
Basically, I'd like to know
if it is possible to use MongoDB's positional $
operator (details) multiple times, or put differently, in update scenarios that involve array/document structures with a "degree of nestedness" greater than 1:
{ <update operator>: { "paths.$.queries.$.requests" : value } }
(doesn't work)
instead of "only" be able to use $
once for a top-level array and being bound to use explicit indexes for arrays on "higher levels":
{ <update operator>: { "paths.$.queries.0.requests" : value } }
) (works)
if possible at all, how the corresponding R syntax would look like.
Below you'll find a reproducible example. I tried to be as concise as possible.
require("rmongodb")
db <- "__unittest"
ns <- paste(db, "hosts", sep=".")
# CONNCETION OBJECT
con <- mongo.create(db=db)
# ENSURE EMPTY DB
mongo.remove(mongo=con, ns=ns)
q <- list("_id"="abcd")
b <- list("_id"="abcd", name="unittest.com")
mongo.insert(mongo=con, ns=ns, b=b)
q <- list("_id"="abcd")
b <- list("$push"=list(paths=list(path="home")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
q <- list("_id"="abcd", paths.path="home")
b <- list("$push"=list("paths.$.queries"=list(
name="query1", url="www.unittest.com/home?query1")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
b <- list("$push"=list("paths.$.queries"=list(
name="query2", url="www.unittest.com/home?query2")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
This works, but it involves an explicit index for the second-level array queries
(nested in a subdoc element of array paths
):
q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")
b <- list("$push"=list("paths.$.queries.0.requests"=list(time="2013-02-13")))
> mongo.bson.from.list(b)
$push : 3
paths.$.queries.0.requests : 3
time : 2 2013-02-13
mongo.update(mongo=con, ns, criteria=q, objNew=b)
res <- mongo.find.one(mongo=con, ns=ns, query=q)
> res
_id : 2 abcd
name : 2 unittest.com
paths : 4
0 : 3
path : 2 home
queries : 4
0 : 3
name : 2 query1
requests : 4
0 : 3
time : 2 2013-02-13
url : 2 www.unittest.com/home?query1
1 : 3
name : 2 query2
url : 2 www.unittest.com/home?query2
$
indexes (doesn't work)Now, I'd like to substitute the explicit 0
with the positional $
operator just like I did in order to have the server find the desired subdoc element of array paths
(paths.$.queries
).
AFAIU the documentation, this should work as the crucial thing is to specify a "correct" query selector:
The positional $ operator, when used with the update() method and acts as a placeholder for the first match of the update query selector:
I think I specified a query selector that does find the correct nested element (due to the paths.queries.name="query1"
part):
q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")
I guess translated to "plain MongoDB" syntax, the query selector looks somewhat like this
{ _id: abcd, paths.path: home, paths.queries.name: query1 }
which seems like a valid query selector to me. In fact it does match the desired element/doc:
> !is.null(mongo.find.one(mongo=con, ns=ns, query=q))
[1] TRUE
My thought was that if it works on the top-level, why shouldn't it work for higher levels as well (as long as the query selector points to the right nested components)?
However, the server doesn't seem to like a nested or multiple use of $
:
b <- list("$push"=list("paths.$.queries.$.requests"=list(time="2013-02-14")))
> mongo.bson.from.list(b)
$push : 3
paths.$.queries.$.requests : 3
time : 2 2013-02-14
> mongo.update(mongo=con, ns, criteria=q, objNew=b)
[1] FALSE
I'm not sure if it doesn't work because MongoDB doesn't support this or if I didn't get the R syntax right.
In case you can execute your query from the MongoDB shell you can bypass this limitation by taking advantage of MongoDB cursor's forEach function (http://docs.mongodb.org/manual/reference/method/cursor.forEach/)
Here is an example with 3 nested arrays:
var collectionNameCursor = db.collection_name.find({...});
collectionNameCursor.forEach(function(collectionDocument) {
var firstArray = collectionDocument.firstArray;
for(var i = 0; i < firstArray.length; i++) {
var secondArray = firstArray[i].secondArray;
for(var j = 0; j < secondArray.length; j++) {
var thirdArray = secondArray[j].thirdArray;
for(var k = 0; k < thirdArray.length; k++) {
//... do some logic here with thirdArray's elements
db.collection_name.save(collectionDocument);
}
}
}
});
Note that this is more of a one time solution then a production code but it's going to do the job if you have to write a fix-up script.
The positional operator only supports one level deep and only the first matching element.
There is a JIRA trackable for the sort of behaviour you want here: https://jira.mongodb.org/browse/SERVER-831
I am unsure if it will allow for more than one match but I believe it will due to the dynamics of how it will need to work.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With