I am writing a script in Node.js which needs to do the following:
I have looked at this for some time and have come to the conclusion that it is almost impossible to do this with asynchronous MongoDB calls. There are multiple problems; for example, if you are dealing with 20,000 of these nodes, doing it asynchronously hangs the script. Doing them as a batch insert isn't feasible either, because step 4 needs to check whether the object already exists.
It would be possible to cobble something horrible together which caches the created objects and then saves them in something like step 7, but that would be difficult because the objects go into multiple collections, and at step 4 you would need to look objects up in the cache first and then in the database. If that is the solution then I will just write off JavaScript as broken and write this in Perl instead. So my question is this: for something as simple as the above sequence of actions, can I somehow force MongoDB to be synchronous so that my script doesn't turn into insanity? I want to be able to say document.save() (I'm using Mongoose, by the way) and have it not return until the document has actually been saved.
Edit: Added code
This is called from a loop roughly 20,000 times. I don't care (within reason) how long it takes, but 200,000 async calls to save hang the script, so it can't be that (it also uses over 1.5 GB of RAM at that point). If I cannot make hObj.save(); wait until the object is actually saved then I am going to need to write this in a more capable language.
    models('hs').findOne({name: r2.$.name}, function (err, h) {
        if (err) {
            console.log(err);
        } else {
            var resultObj = createResult(meeting, r1, r2);
            if (h == undefined) {
                var hObj = new models('hs')({
                    name : r2.$.name,
                    results : [resultObj],
                    numResults : 1
                });
                hObj.save();
            } else {
                h.results.push(resultObj);
                h.numResults++;
                h.save();
            }
        }
    });
From the async GitHub page:
eachSeries(arr, iterator, callback)
The same as each, only iterator is applied to each item in arr in series. The next iterator is only called once the current one has completed. This means the iterator functions will complete in order.
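To make the "in series" behaviour concrete, here is a minimal sketch of how such a helper could work. It is not the async library's actual implementation; eachSeriesSketch is a hypothetical name used only for illustration.

```javascript
// Hypothetical, simplified stand-in for async.eachSeries (illustration only).
// It calls iterator(item, next) for one item at a time and only moves on
// once the iterator has invoked next().
function eachSeriesSketch(arr, iterator, done) {
  let i = 0;
  function next(err) {
    // Stop on the first error, or when every item has been handled.
    if (err || i >= arr.length) {
      return done(err || null);
    }
    iterator(arr[i++], next);
  }
  next();
}
```

Because the next item is only started from inside next(), at most one iterator call is in flight at any time, which is what keeps memory use flat. Note that this sketch recurses if the iterator calls back synchronously, so it is only suited to iterators that really are asynchronous.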
So, assuming you have your XML nodes in nodes:
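    async.eachSeries(
        nodes,
        // This will be applied to every node in nodes, one at a time
        function (node, callback) {
            // meeting, r1 and r2 are assumed to be derived from node, as in the question's loop
            models('hs').findOne({name: r2.$.name}, function (err, h) {
                if (err) {
                    // Pass the error on; if callback is never called, the series never continues
                    return callback(err);
                }
                // Async?
                var resultObj = createResult(meeting, r1, r2);
                if (h == undefined) {
                    // Parentheses make sure `new` applies to the model returned by models('hs')
                    var hObj = new (models('hs'))({
                        name : r2.$.name,
                        results : [resultObj],
                        numResults : 1
                    });
                    hObj.save(function (err, p) {
                        // Callback will tell async that you are done (or that the save failed)
                        callback(err);
                    });
                } else {
                    h.results.push(resultObj);
                    h.numResults++;
                    h.save(function (err, p) {
                        // Callback will tell async that you are done (or that the save failed)
                        callback(err);
                    });
                }
            });
        },
        // This will be executed when all nodes have been processed, or on the first error
        function (err) {
            if (err) {
                console.log(err);
            } else {
                console.log('done!');
            }
        }
    );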
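If your Node version supports async/await, the same one-at-a-time flow can also be written as a plain loop. The sketch below is only an illustration: fakeModel is a hypothetical in-memory stand-in, not the Mongoose API, and the item shape ({name, result}) is assumed. Recent Mongoose versions return promises (or thenables) from save() and findOne() when no callback is passed, so the same shape should carry over, but check that against the version you use.

```javascript
// Hypothetical in-memory stand-in for a model; NOT the real Mongoose API.
const store = new Map();
const fakeModel = {
  findOne: async ({ name }) => store.get(name) || null,
  save: async (doc) => {
    store.set(doc.name, doc);
    return doc;
  },
};

// Processes items strictly one after another: each lookup only starts
// after the previous save has finished, so it sees earlier inserts.
async function upsertAllInSeries(items) {
  for (const item of items) {
    const existing = await fakeModel.findOne({ name: item.name });
    if (existing === null) {
      await fakeModel.save({
        name: item.name,
        results: [item.result],
        numResults: 1,
      });
    } else {
      existing.results.push(item.result);
      existing.numResults++;
      await fakeModel.save(existing);
    }
  }
  return store;
}
```

Each await pauses the loop until the pending operation has completed, which is effectively the "don't return until it has actually saved" behaviour the question asks for, without blocking the event loop.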