Is there a way to convert a nested document structure into an array? Below is an example:
Input:
"experience" : {
    "0" : {
        "duration" : "3 months",
        "end" : "August 2012",
        "organization" : {
            "0" : {
                "name" : "Bank of China",
                "profile_url" : "http://www.linkedin.com/company/13801"
            }
        },
        "start" : "June 2012",
        "title" : "Intern Analyst"
    }
},
Expected Output:
"experience" : [
    {
        "duration" : "3 months",
        "end" : "August 2012",
        "organization" : {
            "0" : {
                "name" : "Bank of China",
                "profile_url" : "http://www.linkedin.com/company/13801"
            }
        },
        "start" : "June 2012",
        "title" : "Intern Analyst"
    }
],
Currently I am using a script to iterate over each element, convert it to an array and finally update the document, but this takes a lot of time. Is there a better way of doing this?
You still need to iterate over the content, but instead you should be writing back using bulk operations:
Either for MongoDB 2.6 and greater:
var bulk = db.collection.initializeUnorderedBulkOp(),
    count = 0;
// Select only documents where "experience" is not yet an array
db.collection.find({
    "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
    // Wrap the single entry stored under key "0" in an array
    bulk.find({ "_id": doc._id }).updateOne({
        "$set": { "experience": [doc.experience["0"]] }
    });
    count++;
    // Send the batch once every 1000 queued updates, then start a new one
    if ( count % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeUnorderedBulkOp();
    }
});
// Send whatever is left in the final, partial batch
if ( count % 1000 != 0 )
    bulk.execute();
Or in modern releases of MongoDB 3.2 and greater, the bulkWrite() method is preferred:
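var ops = [];
// Select only documents where "experience" is not yet an array
db.collection.find({
    "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
    // Queue one update that wraps the entry under key "0" in an array
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": { "$set": { "experience": [doc.experience["0"]] } }
        }
    });
    // Send the batch once 1000 operations are queued, then start a new one
    if ( ops.length == 1000 ) {
        db.collection.bulkWrite(ops, { "ordered": false });
        ops = [];
    }
});
// Send whatever is left in the final, partial batch
if ( ops.length > 0 )
    db.collection.bulkWrite(ops, { "ordered": false });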
So when writing back to the database while iterating a cursor, bulk write operations with "unordered" set are the way to go. Each batch of 1000 operations costs a single request and response, which removes most of the per-update round-trip overhead, and "unordered" means the server can apply the writes in parallel rather than strictly one after another. Both make the whole update faster.
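The snippets above copy only the entry stored under key "0", which matches the example in the question. If some documents also hold keys "1", "2" and so on, a helper along these lines could collect every entry instead. This is a minimal sketch: the names numericKeysToArray and toUpdateOp are hypothetical, and how missing or scalar values are handled is an assumption rather than something the question specifies.

```javascript
// Hypothetical helper (not from the original answer): turn an object whose
// keys are numeric strings ("0", "1", ...) into an array ordered by key.
function numericKeysToArray(value) {
    if (Array.isArray(value)) return value;                    // already converted
    if (value === null || value === undefined) return [];      // assumption: missing field becomes []
    if (typeof value !== "object") return [value];             // assumption: wrap a scalar as-is
    return Object.keys(value)
        .sort(function (a, b) { return Number(a) - Number(b); }) // "2" sorts before "10"
        .map(function (key) { return value[key]; });
}

// Hypothetical helper: build one operation in the shape bulkWrite() expects.
function toUpdateOp(doc) {
    return {
        updateOne: {
            filter: { _id: doc._id },
            update: { $set: { experience: numericKeysToArray(doc.experience) } }
        }
    };
}
```

Inside the forEach loop of the bulkWrite() version, ops.push(toUpdateOp(doc)) would then take the place of the inline operation object.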
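For MongoDB versions newer than 4.2 (the $merge stage was added in 4.2, and 4.4 or later is needed to merge into the same collection that is being aggregated):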
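db.doc.aggregate([
    // Only documents where "experience" is still an embedded document.
    // (A filter of 'experience.0': { $exists: false } would skip them, because key "0" exists.)
    { $match: { $expr: { $eq: [ { $type: "$experience" }, "object" ] } } },
    // Wrap the entry stored under key "0" in an array
    { $project: { experience: [ "$experience.0" ] } },
    { $merge: { into: "doc", on: "_id" } }
])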
Note: this merges the projected field into the existing document instead of replacing the entire document. "merge" is the default behavior of $merge when a matching document is found (the whenMatched option); you can pass other values such as "replace" or "keepExisting".
Ref: $merge
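If the embedded document can hold more than one numbered key, the same aggregation approach could convert all of them with the $objectToArray and $map operators. The pipeline below is a sketch under assumptions: it reuses the collection name doc from the answer, relies on the stored key order being the order you want, and has not been run against the asker's data. The whenMatched and whenNotMatched values are written out only to make the defaults discussed above visible.

```javascript
// Sketch only: the collection name "doc" follows the answer above.
const pipeline = [
    // Only documents where "experience" is still an embedded document
    { $match: { $expr: { $eq: [ { $type: "$experience" }, "object" ] } } },
    // $objectToArray yields [{ k: "0", v: {...} }, ...]; keep just the values
    { $project: {
        experience: {
            $map: {
                input: { $objectToArray: "$experience" },
                as: "entry",
                in: "$$entry.v"
            }
        }
    } },
    // Merge the new "experience" field into the matching document by _id;
    // "discard" skips results that have no match instead of inserting them
    { $merge: { into: "doc", on: "_id", whenMatched: "merge", whenNotMatched: "discard" } }
];

// In the mongo shell this would be run as:
// db.doc.aggregate(pipeline);
```

Because everything runs on the server, no documents travel to the client and back, which is the main difference from the cursor-based scripts.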