Issue: I currently have a mongo collection with 100,000 documents. Each document has 3 fields (_id, name, age). I want to add a 4th field to each document called hashValue that stores the md5 hash of that document's name field.
I can currently interact with my collection via the mongo shell or via Mongoose ODM as part of a Node.js app.
Possible Solutions:
Use Mongoose - I realize this won't work (I don't believe you can iterate through a cursor in this manner), but hopefully it shows what I'm trying to do.
var crypto = require('crypto');
MyCollection.find().forEach(function(el){
    var hash = crypto.createHash('md5').update(el.name).digest("hex");
    // store the hash in the new field rather than overwriting name
    el.hashValue = hash;
    el.save();
});
Use mongo Shell - Almost the same as above, and I realize something like the above syntax would work there. The only issue is that I don't know how to create the md5 hash in the mongo shell, although I am able to iterate through each document and add a field.
(Possible workaround) - The goal of this is to be able to query by the md5 hash of a name value. I believe mongo allows you to create a hashed index. The only issue is that I can't find an example of anyone using this for querying (it only seems to be used for sharding), and I'm not sure whether it will work later on. (Example: I want to md5 hash a name I collect from a user, and then query my mongo collection to see whether that md5 hash exists in the hashValue field.)
The mongo shell already has a built-in md5 hash function called hex_md5 (it is a shell helper, not part of standard JavaScript), so you can call it directly in the mongo console.
> hex_md5('john')
527bd5b5d689e2c32ae974c6229ff785
So to update records in your case you can use the following code snippet in mongo console:
db.collection.find().forEach( function(data){
    data.hashValue = hex_md5(data.name);
    db.collection.save(data);
});
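Since the stated goal is to look up a user-supplied name by its hash, the application side has to produce the same hex digest that the shell wrote. Here is a minimal sketch in Node.js, assuming the built-in crypto module and plain string names; md5Hex and buildHashQuery are hypothetical helper names, not part of Mongoose or the shell. For non-ASCII names it is worth checking that both sides agree on the digest.

```javascript
var crypto = require('crypto');

// Hypothetical helper: hex md5 digest of a string, in the same form hex_md5 prints in the shell.
function md5Hex(name) {
    return crypto.createHash('md5').update(name, 'utf8').digest('hex');
}

// Hypothetical helper: filter object for finding a document by the hash of a name.
function buildHashQuery(name) {
    return { hashValue: md5Hex(name) };
}

console.log(md5Hex('john')); // 527bd5b5d689e2c32ae974c6229ff785, matching the shell output above
```

The filter could then be passed to something like MyCollection.findOne(buildHashQuery(userInput)). An ordinary index on hashValue should be enough for that kind of equality lookup.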
You can iterate through the cursor in Mongoose using streams and update all the records with a bulk operation.
var crypto = require('crypto');

mongoose.connection.on("open", function(err,conn) {
    var bulk = MyCollection.collection.initializeUnorderedBulkOp();
    // .stream() is the older Mongoose API; newer versions use .cursor() instead
    MyCollection.find().stream()
        .on('data', function(el){
            var hash = crypto.createHash('md5').update(el.name).digest("hex");
            // add document update operation to a bulk (writes the new hashValue field)
            bulk.find({'_id': el._id}).update({$set: {hashValue: hash}});
        })
        .on('error', function(err){
            // handle error
        })
        .on('end', function(){
            // execute all bulk operations
            bulk.execute(function (error) {
                // final callback (callback is assumed to be defined by your app)
                callback();
            });
        });
});
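With 100,000 documents it may be preferable to send the updates in batches rather than queue every operation in a single bulk. Below is a sketch that builds the operations as plain objects in the shape accepted by bulkWrite, assuming documents with _id and name fields; buildHashUpdateOps and chunk are hypothetical helpers, and the batch size is arbitrary.

```javascript
var crypto = require('crypto');

// Hypothetical helper: one updateOne operation per document, setting hashValue from name.
function buildHashUpdateOps(docs) {
    return docs.map(function (doc) {
        var hash = crypto.createHash('md5').update(doc.name).digest('hex');
        return {
            updateOne: {
                filter: { _id: doc._id },
                update: { $set: { hashValue: hash } }
            }
        };
    });
}

// Hypothetical helper: split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
    var batches = [];
    for (var i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Example with in-memory documents; no database connection is needed for this part.
var ops = buildHashUpdateOps([
    { _id: 1, name: 'john' },
    { _id: 2, name: 'jane' },
    { _id: 3, name: 'joe' }
]);
var batches = chunk(ops, 2);
console.log(batches.length); // 2
```

Each batch could then be sent with MyCollection.bulkWrite(batch, { ordered: false }), assuming a Mongoose version that provides Model.bulkWrite.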