
How to remove documents with broken reference in MongoDB?

Tags:

mongodb

I have two collections in Mongo:

db.user.find():
{
  "_id": { "$oid" : "52db05e6a2cb2f36afd63c47" },
  "name": "John",
  "authority_id": { "$oid" : "52daf174a2cb2f62aed63af3" }
}
{
  "_id": { "$oid" : "52db05e6a2cb2f36afd63d00" },
  "name": "Joe",
  "authority_id": { "$oid" : "52daf174a2cb2f62aed63af3" }
}

and

db.authority.find():
{
  "_id": { "$oid" : "52daf174a2cb2f62aed63af3" },
  "name": "Sample Authority"
}

Each user stores a reference to its authority's _id as an ObjectId.

Now my problem: several authorities have been deleted and are no longer in the collection. I need a way to iterate through the "user" collection and delete every user whose authority_id points to a deleted authority.

I have tried this:

db.user.find({
    $where: function() {
        return db.authority.find({ _id: this.authority_id }).count() == 0;
    }
})

but "db" is not accessible inside $where. Is it possible to check the reference while iterating over the collection?

romaninsh asked Jan 19 '14 00:01



2 Answers

You can use an aggregation pipeline to find all orphaned users and then delete them.

const orphanUsers = db.user.aggregate([
    {
      // Join authority collection using authority_id
      $lookup: {
        from: "authority",
        localField: "authority_id",
        foreignField: "_id",
        as: "authority"
      }
    },
    // filter users without authority (means authority_id doesn't exist)
    { $match: { authority: [] } },
    // return only the _id
    { $project: { _id: "$_id" } }
])

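// Delete all orphan users; toArray() drains the aggregation cursor
// so that $in receives a plain array of _id values
db.user.deleteMany({
    _id: { $in: orphanUsers.toArray().map(({ _id }) => _id) }
})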
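
To make the logic of the pipeline concrete, here is a minimal plain-JavaScript sketch of the same "users without a matching authority" check. It runs without MongoDB and is only a model of the $lookup and $match steps, not how the server executes them. The helper name findOrphanIds, the string ids, and the extra user "Jane" are assumptions made for illustration.

```javascript
// Plain-JavaScript model of the $lookup + $match steps (no MongoDB required).
// `findOrphanIds` is a hypothetical helper; ids are plain strings here, not ObjectIds.
function findOrphanIds(users, authorities) {
  // Index the existing authority ids for fast membership checks.
  const known = new Set(authorities.map((a) => a._id));
  // A user is orphaned when its authority_id is not among the known ids.
  return users.filter((u) => !known.has(u.authority_id)).map((u) => u._id);
}

const authorities = [
  { _id: "52daf174a2cb2f62aed63af3", name: "Sample Authority" },
];
const users = [
  { _id: "52db05e6a2cb2f36afd63c47", name: "John", authority_id: "52daf174a2cb2f62aed63af3" },
  // Hypothetical user pointing at an authority that no longer exists.
  { _id: "u-orphan", name: "Jane", authority_id: "deleted-authority" },
];

console.log(findOrphanIds(users, authorities)); // [ 'u-orphan' ]
```

The returned ids play the role of the array passed to $in in the deleteMany call.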
Rui Castro answered Oct 05 '22 14:10


You can remove broken entries by iterating over a cursor in the JavaScript shell or with any MongoDB driver. The following example shows the idea in the shell.

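// For each user, check whether its authority still exists;
// delete the user when no matching authority is found.
db.user.find().forEach(function(myDoc) {
    var cursor = db.authority.find({ _id: myDoc.authority_id }).limit(1);
    if (!cursor.hasNext()) {
        db.user.deleteOne({ _id: myDoc._id });
    }
});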
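
The loop above issues one authority query per user. If the set of authority ids is small enough to be returned in a single distinct result, a possible alternative is to fetch the ids once and delete with $nin. This is a sketch under that assumption, not part of the original answer; removeOrphanUsers and its dryRun option are hypothetical names, and db is passed in explicitly so the function can be tried against any database handle.

```javascript
// Sketch: delete users whose authority_id matches no existing authority.
// `removeOrphanUsers` and `dryRun` are hypothetical names used for illustration.
function removeOrphanUsers(db, { dryRun = false } = {}) {
  // One query for every _id that still exists in "authority".
  const authorityIds = db.authority.distinct("_id");
  // Note: $nin also matches users that have no authority_id field at all.
  const filter = { authority_id: { $nin: authorityIds } };
  if (dryRun) {
    // Only report how many users would be deleted.
    return db.user.countDocuments(filter);
  }
  return db.user.deleteMany(filter);
}
```

In the shell this could be called as removeOrphanUsers(db, { dryRun: true }) to see the count first, then removeOrphanUsers(db) to delete. If the authority collection is empty, every user matches the filter, so the dry run is worth checking before deleting.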
Parvin Gasimzade answered Oct 05 '22 14:10