How to aggregate to get every combination of two users per movie grouping key?

Question

Here is my collection:

{ "user" : 1, "rate" : 1, "movie" : 1}
{ "user" : 1, "rate" : 3, "movie" : 3}
{ "user" : 1, "rate" : 2, "movie" : 4}
{ "user" : 1, "rate" : 3, "movie" : 5}
{ "user" : 2, "rate" : 4, "movie" : 1}
{ "user" : 2, "rate" : 2, "movie" : 3}
{ "user" : 2, "rate" : 5, "movie" : 6}
{ "user" : 3, "rate" : 1, "movie" : 3}

Here is the result I want get:

{ "user1" : 1, "rate1" : 1,"user2" : 2, "rate2" : 4, "movie" : 1}
{ "user1" : 1, "rate1" : 3,"user2" : 2, "rate2" : 2, "movie" : 3}
{ "user1" : 1, "rate1" : 3,"user2" : 3, "rate2" : 1, "movie" : 3}
{ "user1" : 2, "rate1" : 2,"user2" : 3, "rate2" : 1, "movie" : 3}

For every "movie" that is every combination of the "user" and "rate" values as a pair where there is more than one "user" for that "movie"

Neil Lunn · Accepted Answer

You are looking for the "permutations" when grouping on "movie" for two or more "user" values. It's not the sort of thing you really get from a "database" query.

But if you $group on "movie" and $push the other data, you can then filter out any content with less than two entries in that array and work out your "possible combinations" from there. So you can ask the database to do the grouping and filtering, but the the rest is for algorithms in "set theory".

So this is "part" aggregation statement and "part" processing of the code in result:

db.movies.aggregate([
  { "$group": {
    "_id": "$movie",
    "people": {
      "$push": {
        "user": "$user",
        "rate": "$rate"
      }
    }
  }},
  { "$redact": {
    "$cond": {
      "if": { "$gt": [{ "$size": "$people" }, 1] },
      "then": "$$KEEP",
      "else": "$$PRUNE"
    }
  }},
  { "$sort": { "_id": 1 } }
]).forEach(function(doc) {
  var n = doc.people.length;
  var i,j;

  for (i = 0; i < n; i++) {
    for (j = i + 1; j < n; j++) {
      printjson({ 
        "user1": doc.people[i].user, 
        "rate1": doc.people[i].rate,
        "user2": doc.people[j].user,
        "rate2": doc.people[j].rate,
        "movie": doc._id
      })
    }
  }
})

So the aggregation part itself first does a $group on the "movie" values as mentioned and creates the array with $push. Since not all results are going to have an array with more than one entry, you then remove those with $redact. This is a "logical filter" that uses $size to compare the generated array and see if it has "more than" ( $gt ) one entry.

The results at this stage look like this, after also applying a $sort:

{
  "_id" : 1,
  "people" : [
    {
      "user" : 1,
      "rate" : 1
    },
    {
      "user" : 2,
      "rate" : 4
    }
  ]
}
{
  "_id" : 3,
  "people" : [
    {
      "user" : 1,
      "rate" : 3
    },
    {
      "user" : 2,
      "rate" : 2
    },
    {
      "user" : 3,
      "rate" : 1
    }
  ]
}

The next part is really up to an "algorithm" to generate the possible "pair" combinations. It's a pretty common and well known approach, so you just run the loops on the arrays of each document returned in response to produce the result:

{ "user1" : 1, "rate1" : 1, "user2" : 2, "rate2" : 4, "movie" : 1 }
{ "user1" : 1, "rate1" : 3, "user2" : 2, "rate2" : 2, "movie" : 3 }
{ "user1" : 1, "rate1" : 3, "user2" : 3, "rate2" : 1, "movie" : 3 }
{ "user1" : 2, "rate1" : 2, "user2" : 3, "rate2" : 1, "movie" : 3 }

How to aggregate to get every combination of two users per movie grouping key?

Tags:

mongodb

mongodb-query

aggregation-framework

user6148078

1 Answers

Neil Lunn

Recent Activity

Donate For Us

How to aggregate to get every combination of two users per movie grouping key?

Tags:

mongodb

mongodb-query

aggregation-framework

user6148078

1 Answers

Neil Lunn

Related questions

Recent Activity

Donate For Us