
How to delete the documents returned by group in mongodb?

I am a MongoDB beginner working on a homework problem. The dataset looks like this:

{ "_id" : { "$oid" : "50906d7fa3c412bb040eb577" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb578" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb579" }, "student_id" : 0, "type" : "homework", "score" : 14.8504576811645 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb57a" }, "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb57b" }, "student_id" : 1, "type" : "exam", "score" : 74.20010837299897 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb57c" }, "student_id" : 1, "type" : "quiz", "score" : 96.76851542258362 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb57d" }, "student_id" : 1, "type" : "homework", "score" : 21.33260810416115 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb57e" }, "student_id" : 1, "type" : "homework", "score" : 44.31667452616328 }

As part of the problem, I have to delete, for each student, the 'homework' document with the lowest score. Here is my strategy:

In an aggregation pipeline:
1: Filter the documents down to type: homework
2: Sort by student_id, then score
3: Group on student_id and take the first element

This gives me, for each student, the homework document with the lowest score.

However, how do I delete these documents from the original dataset? Any guidance or hints?

Dude asked Jun 03 '15 09:06



2 Answers

Use the cursor result from the aggregation to loop through the documents with the cursor's forEach() method and then remove each document from the collection using the _id as the query in the remove() method. Something like this:

var cursor = db.grades.aggregate(pipeline);
cursor.forEach(function (doc){
    db.grades.remove({"_id": doc._id});
});

Another approach is to build an array of the documents' _id values with the cursor's map() method and remove them all in a single call:

var cursor = db.grades.aggregate(pipeline),
    ids = cursor.map(function (doc) { return doc._id; });
db.grades.remove({"_id": { "$in": ids }});
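
Both snippets above rely on a pipeline variable that is never defined. A minimal sketch of one possible definition follows, based on the filter / sort / group-first strategy from the question. The output field names doc_id and student_id are my own choice, and the final $project is there only so that doc._id is the _id of the grade document to remove:

```javascript
// Hypothetical definition of `pipeline`: yields one document per student,
// whose _id is the _id of that student's lowest-scoring homework grade.
var pipeline = [
    // 1. Keep only homework documents.
    { "$match": { "type": "homework" } },
    // 2. Order by student, then by ascending score.
    { "$sort": { "student_id": 1, "score": 1 } },
    // 3. After the sort, the first document in each group has the lowest score.
    { "$group": {
        "_id": "$student_id",
        "doc_id": { "$first": "$_id" },
        "score": { "$first": "$score" }
    } },
    // 4. Restore the original _id so that doc._id can be passed to remove().
    { "$project": { "_id": "$doc_id", "student_id": "$_id", "score": 1 } }
];
```

Without the last stage, doc._id would hold the student_id (the group key) rather than the _id of a grade document, and the remove() calls would match nothing.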

-- UPDATE --

For large deletion operations, it may be more efficient to copy the documents you want to keep into a new collection and then call drop() on the original collection. To do that, the aggregation pipeline has to return every document except each student's lowest homework document, and write the result to another collection using the $out operator as the final pipeline stage. Consider the following aggregation pipeline:

db.grades.aggregate([    
    {
        '$group':{
            '_id': {
                "student_id": "$student_id",
                "type": "$type"
            },
            'lowest_score': { "$min": '$score'},
            'data': {
                '$push': '$$ROOT'
            }
         }
    },    
    {
        "$unwind": "$data"
    },
    {
        "$project": {
            "_id": "$data._id",
            "student_id" : "$data.student_id",
            "type" : "$data.type",
            "score" : "$data.score",
            'lowest_score': 1,            
            "isHomeworkLowest": {
                "$cond": [
                    { 
                        "$and": [
                            { "$eq": [ "$_id.type", "homework" ] },
                            { "$eq": [ "$data.score", "$lowest_score" ] }
                        ] 
                    },
                    true,
                    false
                ]
            }
        }
    },
    {
        "$match": {"isHomeworkLowest" : false}
    },
    {
        "$project": {           
            "student_id": 1,
            "type": 1,
            "score": 1
        }
    },
    {
        "$out": "new_grades"
    }
])

You can then drop the old collection with db.grades.drop() and query the new one with db.new_grades.find().
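
Since drop() cannot be undone, it may be worth comparing document counts first. The sketch below is only an illustration: dropOldIfComplete and its expectedRemoved argument are hypothetical names, and it assumes the legacy shell's count() method (newer shells also offer countDocuments()):

```javascript
// Sketch: drop `grades` only if `new_grades` is missing exactly the number
// of documents the pipeline was meant to leave out.
// `expectedRemoved` is a hypothetical argument supplied by the caller.
function dropOldIfComplete(db, expectedRemoved) {
    var before = db.grades.count();
    var after = db.new_grades.count();
    if (before - after !== expectedRemoved) {
        // Counts do not add up: keep the original collection untouched.
        return false;
    }
    db.grades.drop();
    return true;
}
```

For the course dataset described in the other answer (800 documents, 200 students), the call would be dropOldIfComplete(db, 200), which expects 600 documents in new_grades.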

chridam answered Oct 08 '22 14:10


I think this is part of the homework for MongoDB for Java Developers, a course provided by MongoDB University, where the requirement is to delete the lowest homework score for each student. Anyway, I solved it this way, and I hope it will be helpful for you. You can also clone my code from my GitHub link (provided below).

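import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

import static com.mongodb.client.model.Filters.eq;

public class Homework2Week2 {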

public static void main(String[] args) {
    // The API used here is that of mongo-java-driver-3.2.2.jar.
    // If you use a different version of mongo-java-driver,
    // check the documentation for that version.
    MongoClient mongoClient = new MongoClient();
    // get handle to "students" database
    MongoDatabase database = mongoClient.getDatabase("students");
    // get a handle to the "grades" collection
    MongoCollection<Document> collection = database.getCollection("grades");
    /*
     * Write a program in the language of your choice that will remove the grade of type "homework" with the lowest score for each student from the dataset in the handout. 
     * Since each document is one grade, it should remove one document per student. 
     * This will use the same data set as the last problem, but if you don't have it, you can download and re-import.
     * The dataset contains 4 scores each for 200 students.
     * First, let's confirm your data is intact; the number of documents should be 800.

     *Hint/spoiler: If you select homework grade-documents, sort by student
      and then by score, you can iterate through and find the lowest score
      for each student by noticing a change in student id. As you notice
      that change of student_id, remove the document.
     */
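    // Sort homework grades by student, then by ascending score, so the first
    // document seen for each student is that student's lowest homework score.
    MongoCursor<Document> cursor = collection.find(eq("type", "homework"))
            .sort(new Document("student_id", 1).append("score", 1)).iterator();
    int curStudentId = -1;
    try {
        while (cursor.hasNext()) {
            Document doc = cursor.next();
            int studentId = doc.getInteger("student_id");
            if (studentId != curStudentId) {
                // New student: delete exactly this document, matched by its _id
                collection.deleteOne(eq("_id", doc.get("_id")));
                curStudentId = studentId;
            }
        }
    } finally {
        // Close cursor
        cursor.close();
    }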
    //Close mongoClient
    mongoClient.close();
}

}

The complete project code is in my GitHub account. Anyone who wants to can try it from this link.

Humaun Rashid Nayan answered Oct 08 '22 14:10