
Finding duplicate values in a MongoDB array

Tags:

mongodb

I have a collection containing documents in the following format:

{ 
    "_id" : ObjectId("5538e75c3cea103b25ff94a3"), 
    "userID" : "USER001", 
    "userName" : "manish", 
    "collegeIDs" : [
        "COL_HARY",
        "COL_MARY",
        "COL_JOHNS",
        "COL_CAS",
        "COL_JAMES",
        "COL_MARY",
        "COL_MARY",
        "COL_JOHNS"
    ]
}

I need to find the collegeIDs that are repeated. For the document above the result should be "COL_MARY" and "COL_JOHNS", along with the repeat count if possible. Please provide a MongoDB query for this.

lime_pal asked Sep 10 '15 12:09




1 Answer

There are probably many such documents, so you likely want the duplicates per document, grouped by its _id:

db.myCollection.aggregate([
  {"$project": {"collegeIDs":1}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":{"_id":"$_id", "cid":"$collegeIDs"}, "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
  {"$group": {"_id": "$_id._id", "collegeIDs":{"$addToSet":"$_id.cid"}}}
])
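
The question also asks for the repeat count, which the pipeline above drops in its last stage. A minimal sketch of a variant that keeps it, assuming the same collection name; the output field name "duplicates" is only an illustrative choice:

```javascript
// Same stages as above, but the final $group pushes {collegeID, count}
// pairs instead of collecting only the IDs.
// "duplicates" is a hypothetical field name chosen for this sketch.
const pipeline = [
  { $project: { collegeIDs: 1 } },
  { $unwind: "$collegeIDs" },
  // One group per (document, collegeID) pair, counting occurrences.
  { $group: { _id: { _id: "$_id", cid: "$collegeIDs" }, count: { $sum: 1 } } },
  // Keep only the IDs that occur more than once.
  { $match: { count: { $gt: 1 } } },
  // Regroup per document, keeping each repeated ID with its count.
  { $group: {
      _id: "$_id._id",
      duplicates: { $push: { collegeID: "$_id.cid", count: "$count" } }
  } }
];

// Runs only where a mongo shell `db` handle is available.
if (typeof db !== "undefined") {
  db.myCollection.aggregate(pipeline).forEach(doc => printjson(doc));
}
```

Since each (document, collegeID) pair reaches the last stage exactly once, $push cannot introduce repeats here, so $addToSet is not needed.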

Or this may be what you want (it is not clear from your question), the repeated IDs and their counts for a single user:

db.myCollection.aggregate([
  {"$match": {"userID":"USER001"}},
  {"$project": {"collegeIDs":1, "_id":0}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":"$collegeIDs", "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}}
])
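
To see what the single-user pipeline computes, here is a plain JavaScript sketch that mirrors its $group and $match stages in memory, using the array from the question. The function name findDuplicates is hypothetical and this is not part of MongoDB:

```javascript
// Count occurrences of each value and keep only those seen more than once.
function findDuplicates(values) {
  const counts = new Map();
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1); // like $group with {$sum: 1}
  }
  return [...counts]
    .filter(([, count]) => count > 1)        // like $match {count: {$gt: 1}}
    .map(([id, count]) => ({ _id: id, count }));
}

const collegeIDs = [
  "COL_HARY", "COL_MARY", "COL_JOHNS", "COL_CAS",
  "COL_JAMES", "COL_MARY", "COL_MARY", "COL_JOHNS"
];

const result = findDuplicates(collegeIDs);
console.log(result);
// [ { _id: 'COL_MARY', count: 3 }, { _id: 'COL_JOHNS', count: 2 } ]
```

The aggregation should return the same two entries for this document, though $group does not guarantee the order of its output, so add a $sort stage if the order matters.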
Cetin Basoz answered Sep 19 '22 08:09
