
Removing duplicate array values from MongoDB

In MongoDB I have a collection whose documents contain arrays with duplicate entries, like this:

{
    "_id": ObjectId("57cf3cdd5f20a3b0ba009777"),
    "Chat": 6,
    "string": [
        "1348157031 Riyadh",
        " 548275320 Mohammad Sumon",
        " 1348157031 Riyadh",
        " 548275320 Mohammad Sumon",
        " 1348157031 Riyadh",
        " 1348157031 Riyadh"
    ]
}

I need to remove the duplicate array values and keep only the unique ones, as below.

{
    "_id": ObjectId("57cf3cdd5f20a3b0ba009777"),
    "Chat": 6,
    "string": [
        "1348157031 Riyadh",
        " 548275320 Mohammad Sumon"
    ]
}

What would be the best way to do this?

Thanks.

Sumon asked May 03 '26 11:05


2 Answers

db.getCollection('Test').aggregate([
    // emit one document per array element
    { $unwind: '$string' },
    {
        // regroup by _id, collecting only distinct elements
        $group: {
            _id: '$_id',
            string: { $addToSet: '$string' },
            Chat: { $first: '$Chat' }
        }
    }
]);

Output: you still get two "1348157031 Riyadh" entries because one of them has a leading space, which makes it a different string.

{
    "_id" : ObjectId("57cf3cdd5f20a3b0ba009777"),
    "string" : [ 
        " 1348157031 Riyadh", 
        " 548275320 Mohammad Sumon", 
        "1348157031 Riyadh"
    ],
    "Chat" : 6
}
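
If the leading spaces are not meaningful, one option is to trim each element before deduplicating. The following is a minimal sketch, assuming MongoDB 4.0+ (for $trim); the names trimAndDedupe and dedupeTrimmed are made up for illustration, and the plain-JavaScript function is only there to show the intended effect without a server.

```javascript
// Aggregation stage (assumption: MongoDB 4.0+ for $trim, 3.4+ for $addFields).
// $map trims every element, then $setUnion with an empty array drops duplicates.
const trimAndDedupe = [
  {
    $addFields: {
      string: {
        $setUnion: [
          { $map: { input: "$string", as: "s", in: { $trim: { input: "$$s" } } } },
          []
        ]
      }
    }
  }
];

// Hypothetical plain-JavaScript equivalent, to show what the stage is meant to do.
// Unlike $setUnion, a Set keeps the first-seen order of the elements.
function dedupeTrimmed(values) {
  return [...new Set(values.map((v) => v.trim()))];
}

const sample = [
  "1348157031 Riyadh",
  " 548275320 Mohammad Sumon",
  " 1348157031 Riyadh",
  " 548275320 Mohammad Sumon",
  " 1348157031 Riyadh",
  " 1348157031 Riyadh"
];

console.log(dedupeTrimmed(sample));
```

In the shell the pipeline would be run as db.getCollection('Test').aggregate(trimAndDedupe). Note that this changes the stored values (the leading spaces are gone), which may or may not be what you want.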
Shantanu Madane answered May 06 '26 00:05


MongoDB 3.4+ has the $addFields aggregation stage, which lets you avoid explicitly listing all the other fields you want to keep:

collection.aggregate([
    {"$addFields": {
        "string": {"$setUnion": ["$string", []]}
    }}
])
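
An aggregation only returns the deduplicated documents; it does not change what is stored. If the goal is to rewrite the collection, the same $setUnion expression can be used in a pipeline-style update. This is a sketch under assumptions: it needs MongoDB 4.2+ (updates that accept an aggregation pipeline), buildDedupeUpdate is a hypothetical helper name, and 'Test' is the collection name borrowed from the other answer.

```javascript
// Hypothetical helper (not part of any MongoDB API): builds a pipeline-style
// update that replaces `field` with its deduplicated version.
function buildDedupeUpdate(field) {
  return [{ $set: { [field]: { $setUnion: ["$" + field, []] } } }];
}

const field = "string";
const update = buildDedupeUpdate(field);

// Only touch documents where the field really is an array; $setUnion would
// otherwise produce null for documents that lack the field.
const filter = { [field]: { $type: "array" } };

console.log(JSON.stringify(update));

// In the mongo shell / mongosh, `db` is a global. Under plain Node.js it is
// undefined, so this branch is skipped and nothing is sent to a server.
if (typeof db !== "undefined") {
  db.getCollection("Test").updateMany(filter, update);
}
```

Keep in mind that the order of elements returned by $setUnion is unspecified, so the stored array may come back in a different order than before.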

For reference, here is another, lengthier way that uses $replaceRoot (with $mergeObjects, so MongoDB 3.6+) and also doesn't require listing every field:

collection.aggregate([
    {'$unwind': {
        'path': '$string',
        // output the document even if its "string" array is empty or missing
        'preserveNullAndEmptyArrays': true
    }},
    {'$group': {
        '_id': '$_id',
        'string': {'$addToSet': '$string'},
        // arbitrary name that doesn't exist on any document
        '_other_fields': {'$first': '$$ROOT'},
    }},
    {
      // per the docs, when a field appears in several merged documents, the value from the last one wins,
      // so the new deduplicated array replaces the original value
      '$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
    },
    {'$project': {'_other_fields': 0}}
])    
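
The "last document wins" behaviour of the $mergeObjects step can be illustrated with a plain-JavaScript analogy. This is only a sketch: object spread stands in for $mergeObjects, the numeric _id is a simplified placeholder, and the variable names are invented for the example.

```javascript
// What {'$first': '$$ROOT'} captured: the first unwound document, where
// "string" is a single element rather than an array.
const otherFields = { _id: 1, Chat: 6, string: "1348157031 Riyadh" };

// What $group produced: the deduplicated array plus the captured copy.
const grouped = {
  _id: 1,
  string: ["1348157031 Riyadh", " 548275320 Mohammad Sumon"],
  _other_fields: otherFields
};

// Analogue of {'$mergeObjects': ['$_other_fields', '$$ROOT']}:
// later sources overwrite earlier ones, so the array replaces the single value.
const merged = { ...otherFields, ...grouped };

// Analogue of {'$project': {'_other_fields': 0}}: drop the helper field.
const { _other_fields, ...result } = merged;

console.log(result);
```

The result keeps Chat from the original document while "string" holds the deduplicated array, which is why no other field has to be listed by name.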
Dennis Golomazov answered May 06 '26 01:05