I have a dating app, and I store all the potential Match objects in MongoDB (a Match object happens when user swipes left or right):
{
   uid1: <userid1>,
   uid2: <userid2>,
   uid1action: <L|R|E> (left/right/empty, based what the user1 has done),
   uid2action: <L|R|E> (left/right/empty, based what the user2 has done),
}
Now comes to my question. When I show profiles of potential users to user1, I take in to account all the people who already have liked user1 (because I prioritise these profiles):
var likedQuery = Parse.Query.or(new Parse.Query("Match")
    .equalTo("uid1", userId)
    .equalTo("u2action", "L")
    .equalTo("u1action", "E") // user1 has not done anything
    .select("uid2")
    .limit(paginationLimit);
Now this is nice, everything works nicely. I am now looking to also order the likedQuery by the amount of likes each user has (popularity).
Say these are the following users who have liked user1:
Paul (paul himself has had 50 people like him)
Logan (logan was liked by 20 people)
Michael (michael was liked by 80 people),
We want to order all these people such that Michael would be the first profile user1 sees.
Now my question is, how will I do it using mongoDB? In SQL this would be quite trivial, just do a table JOIN, order by that table using SUM() and COUNT(), and ensure you have necessary indexes.
In mongoDB, the only way I see how to do it is to have a uid2likes (which will be sorted on) field on each Match object that will be incremented by cron job, but that is ridiculous and doesn't scale.
My question is more about how to do this in a way that scales.
You can use below aggregation query in 3.4.
The idea here is to $match all the users who liked user1 followed by self $lookup to get all the users who liked the users who liked user1.
$group and $sort to sort the matches by count desc.
$limit to limit the matched users.
db.colname.aggregate([
  {"$match":{"uid1":userID,"uid2action":"L","uid1action":"E"}},
  {"$lookup":{
    "from":colname,
    "localField":"uid2",
    "foreignField":"uid1",
    "as":"uid2likes"
  }},
  {"$unwind":"$uid2likes"},
  {"$match":{"uid2likes.uid2action":"L"}},
  {"$group":{
    "_id":{"uid1":"$uid1","uid2":"$uid2"},
    "uid2likecount":{"$sum":1}
  }},
  {"$sort":{"uid2likecount":-1}},
  {"$limit":paginationLimit}
])
Couple of notes
It is important to use $lookup + $unwind + $match which is optimized in 3.4 to run by moving query predicate $match inside $lookup. More here
You may make use of existing index ( assuming you have one on uid1) for both initial match and lookup match.
Also try adding index on uid2action and see if it is picked up by $lookup + $match stage. More here and here
Add indexes:
db.colname.createIndex( { uid1: 1 } )
db.colname.createIndex( { uid2action: 1 } )
Measure Index Use:
db.colname.aggregate([{$indexStats: {}}, {$project: {key: 0, host: 0}}]).pretty();
Explain Query:
db.colname.explain("executionStats").aggregate(above pipeline);
You can alternate between indexes and check the execution stats to see how the indexes are being picked up. Also try compound indexes too.
Using 3.6 you can clean up the query a little bit.
db.colname.aggregate([
  {"$match":{"uid1":userID,"uid2action":"L","uid1action":"E"}},
  {"$lookup":{
    "from":colname,
    "let":{"uid2":"$uid2"},
    "pipeline":[
       {"$match":{"$expr":{"$eq":["$uid1","$$uid2"]},"uid2action":"L"}}, 
       {"$count":"count"}
    ],
    "as":"uid2likes"
  }},
  {"$unwind":"$uid2likes"},
  {"$sort":{"uid2likes.count":-1}},
  {"$limit":paginationLimit}
])
                        I would use the aggregation pipeline.
You didn't provide a lot of info about what are your schema so these are my assumptions:
match
users
users collection has a likes field which holds a Number
The query below will return a sorted list of users who voted L for userID (i.e. the current user), ascending by their total likes.
db.match.aggregate([
  { $match: { "uid1": userID, "uid1action": "E", "uid2action": "L" } },
  { $project: { _id: 0, uid2: 1 } },
  { $lookup: {
      from: "users",
      let: { uid: "$uid2" },
      pipeline: [
          { $match: { $expr: { $eq: [ "$_id", "$$uid" ] } } },
          { $project: { _id: 0, likes: 1 } },
      ],
      as: "likes" }
  },
  { $unwind: "$likes" },
  { $project: { _id: "$uid2", likes: "$likes.likes" } },
  { $sort: { likes: -1 } },
  { $limit: paginationLimit }
])
From the match collection get all document that match the criteria ( uid1 is the current user, uid1action is "E" and uid2action is "L" ).
Use only the uid2 field from every Match object.
Perform an inner search in the users collection.
Match all users whos _id equals the uid2.
Take only the likes field from every document there.
Return the results as a likes field.
Now, the last action returns a list so unwind that list.
From that take only the uid2 field as _id and likes.likes field as likes.
Sort the results according to the likes field.
Limit the results according to paginationLimit.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With