
How to relate data in MongoDB?

Tags: mongodb, nosql

I am storing a string in a database along with the owners of the string (one or more owners per string).

I've always worked with MySQL, which is a conventional relational database. There, I would store the string along with a unique id in one table, and then the unique id of the string along with its owners (as multiple records) in a second table.

I could then fetch strings by owners using a SQL Join.

I am now working on a project using MongoDB, and I'm doing the same as above.

Would this be considered the wrong way when working with NoSQL databases? Should I not be thinking in terms of 'relations' when working with NoSQL?

Another way I can think of achieving the same in MongoDB is storing it like this:

{
    "string": "foobar",
    "owners": [
        "owner1",
        "owner2",
        "owner3"
    ]
}

However, in this case, I'm unsure how I would search for "all strings owned by owner1".

asked Jun 24 '12 05:06 by xbonez




2 Answers

Would this be considered the wrong way when working with NoSQL databases? Should I not be thinking in terms of 'relations' when working with NoSQL?

There are so many questions on the case of embedding and it comes down to so little.

Some things that have not been mentioned here need to be considered if you wish to embed:

  • Will the document size increase massively? If so, the document might frequently move on disk, which is a bad thing.
  • Does the related row have a to-many relationship with the collection I am working on (e.g. a video cannot embed its user)? If so, you might get problems when copying redundant data from the related row into the subdocument, especially when updating that redundant data.
  • How will I need to display these results?

Displaying the results is always a key decider in whether or not to embed. If you need to paginate a large number of embedded rows, say 1000, you will need to use the $slice operator, either in normal querying or in the aggregation framework. At 1000 I admit it may be quite fast, but sooner or later that in-memory operation will become slower than normal querying (in fact it always should be).
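To make the $slice pagination concrete, here is a minimal sketch for the Node.js driver. The `pageOfOwners` helper, the `strings` collection handle, and the page arguments are assumptions for illustration; they are not part of the question.

```javascript
// Hypothetical helper: fetch one page of the embedded "owners" array.
// `strings` is assumed to be a Collection object from the official Node.js driver.
async function pageOfOwners(strings, stringId, page, pageSize) {
  const skip = page * pageSize;
  const doc = await strings.findOne(
    { _id: stringId },
    // $slice: [skip, limit] returns only that window of the array
    { projection: { owners: { $slice: [skip, pageSize] } } }
  );
  // No matching document means there is nothing to page through
  return doc ? doc.owners : [];
}
```

The slicing happens per document on the server, which is why it stays cheap for small arrays and gets less attractive as the array grows.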

If you require complex sorting and display of the subdocuments, you might want to split these out and instead use this document structure:

{
    "string": "foobar",
    "owners": [
        ObjectId(),
        ObjectId(),
        ObjectId()
    ]
}

I think this may actually be a more performant structure anyway for your data since the owner sounds like a user row in a users collection.

Instead of populating the subdocuments with user data that may change, you can just reference each user's _id. This is pretty cool, since you can embed the relationship while the document grows only a little. That hopefully means a low chance of constant disk movement and a smaller working set, which makes operations more performant overall. On top of that, the _id of an owner is rarely going to change, so the only operations you will most likely need on this subset of data are create and delete.
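The create and delete operations on the owners array could look roughly like this with the Node.js driver. The helper names and the `strings` collection handle are hypothetical; `$addToSet` and `$pull` are the standard array update operators.

```javascript
// Hypothetical helpers; `strings` is assumed to be a Node.js driver Collection.

// "Create": add an owner reference to a string.
async function addOwner(strings, stringId, ownerId) {
  // $addToSet appends ownerId only if it is not already in the array
  return strings.updateOne({ _id: stringId }, { $addToSet: { owners: ownerId } });
}

// "Delete": remove an owner reference from a string.
async function removeOwner(strings, stringId, ownerId) {
  // $pull removes every array element equal to ownerId
  return strings.updateOne({ _id: stringId }, { $pull: { owners: ownerId } });
}
```

Both updates touch only a single id in the array, so the document changes size by very little each time.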

Getting back to complex sorting and pagination: with this data you can get all the owner ids in a single round trip, and then in a second round trip query for those owners' documents in the users collection with normal querying using $in, allowing for the complex display you require.
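A minimal sketch of those two round trips with the Node.js driver follows. The `ownersOfString` helper, the `strings` and `users` collection handles, and the `name` sort field are assumptions made for the example.

```javascript
// Hypothetical helper; `strings` and `users` are assumed to be Node.js driver Collections.
async function ownersOfString(strings, users, stringId, { skip = 0, limit = 20 } = {}) {
  // Round trip 1: fetch only the array of owner ids from the string document
  const doc = await strings.findOne({ _id: stringId }, { projection: { owners: 1 } });
  if (!doc || !doc.owners || doc.owners.length === 0) return [];

  // Round trip 2: load the owners with a normal query, sorted and paginated by the server
  return users
    .find({ _id: { $in: doc.owners } })
    .sort({ name: 1 }) // "name" is an assumed field on the user documents
    .skip(skip)
    .limit(limit)
    .toArray();
}
```

Because the second query is an ordinary find on the users collection, it can use whatever indexes and sort orders that collection already supports.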

So this structure overall, I have found, is very performant.

Of course, this structure depends on your queries. In some cases it might be better to store the string ids on the user instead, but I don't think that applies here: a user can presumably own many strings, so I would call this a many-to-many relationship embedded on the string side.

Hopefully this helps and I haven't gone round in circles.

answered Oct 30 '22 02:10 by Sammaye


To complement dbaseman's answer:

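Yes, your approach seems ok. You can easily search for "all strings owned by owner1":

db.collection.find({owners: 'owner1'})

This is possible because MongoDB treats arrays in a special way: an equality condition on an array field matches a document if any element of the array equals the value.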
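For a larger collection you would probably also want an index on the array field; MongoDB builds a multikey index in that case, with one entry per array element. A rough sketch with the Node.js driver, where the helper names and the `strings` collection handle are hypothetical:

```javascript
// Hypothetical helpers; `strings` is assumed to be a Node.js driver Collection.

// Run once, e.g. at application startup: index each element of the owners array.
async function ensureOwnerIndex(strings) {
  return strings.createIndex({ owners: 1 });
}

// "All strings owned by owner": the same filter as the shell query above.
async function stringsOwnedBy(strings, owner) {
  return strings.find({ owners: owner }).toArray();
}
```

The filter itself does not change when the index is added; the index only changes how the server finds the matching documents.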

answered Oct 30 '22 01:10 by Sergio Tulentsev