
MongoDB - Unwind array using aggregation and remove duplicates

Tags:

mongodb

I am unwinding an array using the MongoDB aggregation framework. The array contains duplicates, and I need to ignore those duplicates in a later grouping stage.

How can I achieve that?

l a s asked Sep 14 '13 17:09




2 Answers

You can use $addToSet to do this:

    db.users.aggregate([
        { $unwind: '$data' },
        { $group: { _id: '$_id', data: { $addToSet: '$data' } } }
    ]);

It's hard to give you a more specific answer without seeing your actual query.
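To see what these two stages do to duplicates, here is a minimal plain-JavaScript sketch that emulates them in memory. It is not MongoDB code: the sample documents and the helper names `unwind` and `groupAddToSet` are made up for illustration, and it assumes every document has an array in `data`. MongoDB does not guarantee the order of elements produced by $addToSet, whereas this sketch keeps first-seen order.

```javascript
// Hypothetical sample documents; the second array element 'a' is a duplicate.
const users = [
  { _id: 1, data: ['a', 'b', 'a'] },
  { _id: 2, data: ['x', 'x'] },
];

// Emulates { $unwind: '$data' }: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

// Emulates { $group: { _id: '$_id', data: { $addToSet: '$data' } } }:
// collects the distinct values seen for each _id.
function groupAddToSet(docs, field) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc._id)) groups.set(doc._id, new Set());
    groups.get(doc._id).add(doc[field]);
  }
  return [...groups].map(([_id, values]) => ({ _id, [field]: [...values] }));
}

const unwound = unwind(users, 'data');
const deduped = groupAddToSet(unwound, 'data');

console.log(unwound.length); // 5 documents after unwinding
console.log(JSON.stringify(deduped));
// [{"_id":1,"data":["a","b"]},{"_id":2,"data":["x"]}]
```

The grouping key is the original _id, so duplicates are removed within each document, not across the whole collection.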

Roman Pekar answered Oct 08 '22 22:10


You have to use $addToSet, but first you have to group by _id. If you don't, you'll get one element per item in the list.

Imagine a collection posts with documents like this:

    {
        body: "Lorem Ipsum...",
        tags: ["stuff", "lorem", "lorem"],
        author: "Enrique Coslado"
    }

Imagine you want to find the most common tag per author. You'd write an aggregation query like this:

    db.posts.aggregate([
        {$project: {
            author: "$author",
            tags: "$tags",
            post_id: "$_id"
        }},
        {$unwind: "$tags"},
        // Regroup by post so that $addToSet drops duplicate tags within each post
        {$group: {
            _id: "$post_id",
            author: {$first: "$author"},
            tags: {$addToSet: "$tags"}
        }},
        {$unwind: "$tags"},
        // Count each distinct (author, tag) pair
        {$group: {
            _id: {
                author: "$author",
                tags: "$tags"
            },
            count: {$sum: 1}
        }}
    ])

That way you'll get documents like this:

    {
        _id: {
            author: "Enrique Coslado",
            tags: "lorem"
        },
        count: 1
    }
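As a rough check of the counting logic, the pipeline can be emulated in plain JavaScript. This is only a sketch, not MongoDB code: the `_id` values, the second post, and the function name `countTagsPerAuthor` are hypothetical, and it assumes every post has a `tags` array.

```javascript
// The first post is the sample document above (with a made-up _id);
// the second post is hypothetical and added only to show counts above 1.
const posts = [
  { _id: 1, body: 'Lorem Ipsum...', tags: ['stuff', 'lorem', 'lorem'], author: 'Enrique Coslado' },
  { _id: 2, body: 'Dolor sit...', tags: ['lorem'], author: 'Enrique Coslado' },
];

function countTagsPerAuthor(docs) {
  const counts = new Map();
  for (const post of docs) {
    // First $unwind + $group with $addToSet: distinct tags within one post.
    for (const tag of new Set(post.tags)) {
      // Second $unwind + $group: one count per (author, tag) pair.
      const key = JSON.stringify([post.author, tag]);
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return [...counts].map(([key, count]) => {
    const [author, tags] = JSON.parse(key);
    return { _id: { author, tags }, count };
  });
}

const result = countTagsPerAuthor(posts);
console.log(JSON.stringify(result));
```

Here "lorem" is counted once per post even though it appears twice in the first post, so it ends up with a count of 2 across the two posts, and 1 if only the first post exists.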
Enrique Coslado answered Oct 08 '22 23:10