Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB and "joins" [duplicate]

Tags:

mongodb

I'm sure MongoDB doesn't officially support "joins". What does this mean?

Does this mean "We cannot connect two collections(tables) together."?

I think if we put the value for _id in collection A to the other_id in collection B, can we simply connect two collections?

If my understanding is correct, MongoDB can connect two tables together, say, when we run a query. This is done by "Reference" written in http://www.mongodb.org/display/DOCS/Schema+Design.

Then what does "joins" really mean?

I'd love to know the answer because this is essential to learn MongoDB schema design. http://www.mongodb.org/display/DOCS/Schema+Design

like image 631
TK. Avatar asked Nov 01 '10 07:11

TK.


People also ask

Is MongoDB good for Joins?

But in MongoDB, databases are designed to store denormalized data. Fortunately, MongoDB Joins can be performed in MongoDB 3.2 as it introduces a new Lookup operation that can perform Join operations on Collections.

Why joins are not used in MongoDB?

One order and 8 line items are 9 rows in relational; in MongoDB we would typically just model this as a single BSON document which is an order with an array of embedded line items. So in that case, the join issue does not arise.

How do I join multiple collections in MongoDB?

For performing MongoDB Join two collections, you must use the $lookup operator. It is defined as a stage that executes a left outer join with another collection and aids in filtering data from joined documents. For example, if a user requires all grades from all students, then the below query can be written: Students.

Are joins slow in MongoDB?

The conclusion is that MongoDB joins are very brittle (when things change, application programs must be extensively recoded), and often MongoDB offers very poor performance, relative to Postgres.


6 Answers

It's no join since the relationship will only be evaluated when needed. A join (in a SQL database) on the other hand will resolve relationships and return them as if they were a single table (you "join two tables into one").

You can read more about DBRef here: http://docs.mongodb.org/manual/applications/database-references/

There are two possible solutions for resolving references. One is to do it manually, as you have almost described. Just save a document's _id in another document's other_id, then write your own function to resolve the relationship. The other solution is to use DBRefs as described on the manual page above, which will make MongoDB resolve the relationship client-side on demand. Which solution you choose does not matter so much because both methods will resolve the relationship client-side (note that a SQL database resolves joins on the server-side).

like image 106
Emil Vikström Avatar answered Sep 30 '22 21:09

Emil Vikström


The database does not do joins -- or automatic "linking" between documents. However you can do it yourself client side. If you need to do 2, that is ok, but if you had to do 2000, the number of client/server turnarounds would make the operation slow.

In MongoDB a common pattern is embedding. In relational when normalizing things get broken into parts. Often in mongo these pieces end up being a single document, so no join is needed anyway. But when one is needed, one does it client-side.

Consider the classic ORDER, ORDER-LINEITEM example. One order and 8 line items are 9 rows in relational; in MongoDB we would typically just model this as a single BSON document which is an order with an array of embedded line items. So in that case, the join issue does not arise. However the order would have a CUSTOMER which probably is a separate collection - the client could read the cust_id from the order document, and then go fetch it as needed separately.

There are some videos and slides for schema design talks on the mongodb.org web site I belive.

like image 38
dm. Avatar answered Sep 29 '22 21:09

dm.


one kind of join a query in mongoDB, is ask at one collection for id that match , put ids in a list (idlist) , and do find using on other (or same) collection with $in : idlist

u = db.friends.find({"friends": something }).toArray()
idlist= []
u.forEach(function(myDoc) { idlist.push(myDoc.id ); } )
db.family.find({"id": {$in : idlist} } )
like image 40
Sérgio Avatar answered Oct 03 '22 21:10

Sérgio


The first example you link to shows how MongoDB references behave much like lazy loading not like a join. There isn't a query there that's happening on both collections, rather you query one and then you lookup items from another collection by reference.

like image 20
Ian Mercer Avatar answered Sep 30 '22 21:09

Ian Mercer


The fact that mongoDB is not relational have led some people to consider it useless. I think that you should know what you are doing before designing a DB. If you choose to use noSQL DB such as MongoDB, you better implement a schema. This will make your collections - more or less - resemble tables in SQL databases. Also, avoid denormalization (embedding), unless necessary for efficiency reasons.

If you want to design your own noSQL database, I suggest to have a look on Firebase documentation. If you understand how they organize the data for their service, you can easily design a similar pattern for yours.

As others pointed out, you will have to do the joins client-side, except with Meteor (a Javascript framework), you can do your joins server-side with this package (I don't know of other framework which enables you to do so). However, I suggest you read this article before deciding to go with this choice.

Edit 28.04.17: Recently Firebase published this excellent series on designing noSql Databases. They also highlighted in one of the episodes the reasons to avoid joins and how to get around such scenarios by denormalizing your database.

like image 6
Salah Saleh Avatar answered Oct 01 '22 21:10

Salah Saleh


If you use mongoose, you can just use(assuming you're using subdocuments and population):

Profile.findById profileId
  .select 'friends'
  .exec (err, profile) ->
    if err or not profile
      handleError err, profile, res
    else
      Status.find { profile: { $in: profile.friends } }, (err, statuses) ->
        if err
          handleErr err, statuses, res
        else
          res.json createJSON statuses

It retrieves Statuses which belong to one of Profile (profileId) friends. Friends is array of references to other Profiles. Profile schema with friends defined:

schema = new mongoose.Schema
  # ...

  friends: [
    type: mongoose.Schema.Types.ObjectId
    ref: 'Profile'
    unique: true
    index: true
  ]
like image 1
Daniel Kmak Avatar answered Sep 29 '22 21:09

Daniel Kmak