
How to Write a Self Join Query

I have a simple collection, transactions, which holds user and restaurant information:

{ "user_id" : "U1", "restaurant_id" : "R_1" }
{ "user_id" : "U2", "restaurant_id" : "R_1" }
{ "user_id" : "U1", "restaurant_id" : "R_3" }
{ "user_id" : "U1", "restaurant_id" : "R_4" }
{ "user_id" : "U2", "restaurant_id" : "R_4" }

I need to find the restaurants that users U1 and U2 have in common (i.e. the restaurants that both U1 and U2 have visited).

I should receive output like this:

{ "_id" : "R_4", "users" : [ "U2", "U1" ] }
{ "_id" : "R_1", "users" : [ "U2", "U1" ] }

That means restaurants R_1 and R_4 have been visited by both users U1 and U2.

I'm new to MongoDB, so after some googling I wrote this sample query, which is not working:

db.transactions.aggregate([
    {$match: {"user_id": {
        "$in": [ U1, U2]
    }}},
    {
        $lookup: {
           from: "transactions",
           localField: "restaurant_id",
           foreignField: "restaurant_id",
           as: "related_taste"
         }
    }
])
Aman Maurya asked Mar 08 '23 20:03



1 Answer

What you want is the "intersection" of the results (the restaurants common to both users), which goes like this:

db.transactions.aggregate([
    { "$match": { "user_id": { "$in": [ "U1", "U2" ] } }},
    { "$group": {
      "_id": "$restaurant_id",
      "users": { "$addToSet": "$user_id" }
    }},
    { "$match": { "users": { "$all": [ "U1", "U2" ] } } }
])

Which gives the output:

{ "_id" : "R_4", "users" : [ "U2", "U1" ] }
{ "_id" : "R_1", "users" : [ "U2", "U1" ] }

The $group stage groups on restaurant_id and, via $addToSet, collects the set of distinct user_id values seen for each restaurant.

The second $match then uses $all to check that both of the supplied user_id values are present in the set gathered for each restaurant.

So any restaurant visited by only one of the listed users is discarded, leaving just those visited by both.
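To make the three stages concrete, here is a rough in-memory sketch of the same logic in plain JavaScript, with no MongoDB involved. The helper name commonRestaurants is made up for illustration, and this only approximates what the server does; it is not how MongoDB implements the pipeline.

```javascript
// Plain JavaScript, no MongoDB needed: mimics the three pipeline stages in memory.
const transactions = [
  { user_id: "U1", restaurant_id: "R_1" },
  { user_id: "U2", restaurant_id: "R_1" },
  { user_id: "U1", restaurant_id: "R_3" },
  { user_id: "U1", restaurant_id: "R_4" },
  { user_id: "U2", restaurant_id: "R_4" },
];

// commonRestaurants is a hypothetical helper name, not a MongoDB API.
function commonRestaurants(docs, wanted) {
  // Stage 1 ($match with $in): keep only documents for the listed users.
  const matched = docs.filter((d) => wanted.includes(d.user_id));

  // Stage 2 ($group with $addToSet): one Set of users per restaurant_id.
  const groups = new Map();
  for (const { user_id, restaurant_id } of matched) {
    if (!groups.has(restaurant_id)) groups.set(restaurant_id, new Set());
    groups.get(restaurant_id).add(user_id);
  }

  // Stage 3 ($match with $all): keep groups that contain every wanted user.
  return [...groups]
    .filter(([, users]) => wanted.every((u) => users.has(u)))
    .map(([_id, users]) => ({ _id, users: [...users] }));
}

console.log(commonRestaurants(transactions, ["U1", "U2"]));
```

This keeps R_1 and R_4 and drops R_3, which only U1 visited. The ordering of documents and of the users array may differ from the aggregation output, since $group and $addToSet do not guarantee any particular order.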


This assumes your data, with corrections applied, looks like this:

{ "user_id" : "U1", "restaurant_id" : "R_1" }
{ "user_id" : "U2", "restaurant_id" : "R_1" }
{ "user_id" : "U1", "restaurant_id" : "R_3" }
{ "user_id" : "U1", "restaurant_id" : "R_4" }
{ "user_id" : "U2", "restaurant_id" : "R_4" }
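If the list of users needs to vary, the same pipeline could be generated from an array of ids, since both $in and $all take the same list. The sketch below assumes a helper called buildCommonPipeline (a made-up name, not part of MongoDB) and a third user "U3" that does not appear in your sample data.

```javascript
// buildCommonPipeline is a hypothetical helper, not part of MongoDB.
// It returns the same three stages as above for any list of user ids.
function buildCommonPipeline(userIds) {
  return [
    // Only look at transactions from the users of interest.
    { $match: { user_id: { $in: userIds } } },
    // Collect the distinct users seen for each restaurant.
    { $group: { _id: "$restaurant_id", users: { $addToSet: "$user_id" } } },
    // Keep restaurants whose set contains every requested user.
    { $match: { users: { $all: userIds } } },
  ];
}

// "U3" is an illustrative id only; it is not in the sample data.
const pipeline = buildCommonPipeline(["U1", "U2", "U3"]);
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell the result could then be passed along as:
//   db.transactions.aggregate(pipeline)
```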
Neil Lunn answered Mar 11 '23 08:03
