I have two collections with a many-to-many relationship. I want to store an array of linked ObjectIds in both documents so that I can take Document A and retrieve all linked Document B's quickly, and vice versa.
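Creating this link is a two-step process: add Document B's ObjectId to the array in Document A, then add Document A's ObjectId to the array in Document B.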
After watching a MongoDB video, I found this to be the recommended way of storing a many-to-many relationship between two collections.
I need to be sure that both updates are made. What is the recommended way of robustly dealing with this crucial two-step process without a transaction?
I could condense this relationship into a single link collection. The advantage is a single update, with no chance of Document B missing the link to Document A. The disadvantage is that I'm not really using MongoDB as intended. Still, because there is only a single update, a link collection that defines the many-to-many relationship seems more robust.
Should I use safe mode and manually check the data went in afterwards and try again on failure? Or should I represent the many-to-many relationship in just one of the collections and rely on an index to make sure I can still quickly get the linked documents?
Any recommendations? Thanks
Many-to-many relationships are a type of MongoDB relationship in which each document in one collection can be linked to many documents in another collection, and vice versa.
You can use the updateOne() or updateMany() methods to add, update, or remove array elements in documents that match the specified criteria. Use updateMany() when the same array change should apply to every matching document.
To my knowledge, there's no real limit on the number of docs in a collection. If there is one, it is probably the number of unique _id values MongoDB can generate, and that would be much larger than 500K.
@Gareth, you have multiple legitimate ways to do this. So the key concern is how you plan to query for the data (i.e., which queries need to be fast).
Here are a couple of methods.
Method #1: the "links" collection
You could build a collection that simply contains mappings between the collections.
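As a rough sketch of what that could look like from Python: the functions below assume `links` is a pymongo-style collection, and the field names `a_id` and `b_id` are made up for illustration. Each link is one document, so creating it is a single write.

```python
def ensure_link_indexes(links):
    # The unique compound index rejects duplicate (a_id, b_id) pairs and also
    # serves lookups by a_id; the second index serves lookups by b_id.
    links.create_index([("a_id", 1), ("b_id", 1)], unique=True)
    links.create_index([("b_id", 1)])


def link(links, a_id, b_id):
    # One upsert per link: calling this twice for the same pair leaves one document.
    pair = {"a_id": a_id, "b_id": b_id}
    links.replace_one(pair, dict(pair), upsert=True)


def linked_b_ids(links, a_id):
    # All B ids linked to one A document.
    return [doc["b_id"] for doc in links.find({"a_id": a_id})]


def linked_a_ids(links, b_id):
    # All A ids linked to one B document.
    return [doc["a_id"] for doc in links.find({"b_id": b_id})]
```

Fetching the linked documents themselves still takes a second query against the other collection (for example with `$in` on the returned ids). With the unique index in place, two concurrent upserts of the same pair may surface as a duplicate-key error on one of them, which can be treated as "already linked".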
Pros:
Cons:
Method #2: store copies of smaller mappings in larger collection
For example: you have millions of Products, but only a hundred Categories. Then you would store the Categories as an array inside each Product.
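Here is a minimal sketch of that layout, assuming `products` and `categories` are pymongo-style collections and that each Product keeps its category ids in a field I've called `category_ids` (a hypothetical name). The relationship lives only on the Product side, so adding a link is one write.

```python
def add_category(products, product_id, category_id):
    # $addToSet only appends the id if it is not already in the array,
    # so repeating the call is harmless.
    products.update_one(
        {"_id": product_id},
        {"$addToSet": {"category_ids": category_id}},
    )


def products_in_category(products, category_id):
    # Matching a scalar against an array field matches documents whose
    # array contains that value.
    return list(products.find({"category_ids": category_id}))


def categories_of_product(products, categories, product_id):
    # Forward direction: read the ids off the product, then fetch them in one query.
    product = products.find_one({"_id": product_id})
    ids = product.get("category_ids", []) if product else []
    return list(categories.find({"_id": {"$in": ids}}))
```

The reverse lookup in `products_in_category` is the query that needs an index on `category_ids` to stay fast once there are millions of Products.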
Pros:
Cons:
Method #3: store copies of all mappings in both collections
(what you're suggesting)
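To make the shape concrete, here is a small sketch assuming `coll_a` and `coll_b` are pymongo-style collections and that the arrays are stored in fields I've named `b_ids` and `a_ids` (both names are placeholders). Note that linking and unlinking are each two separate writes.

```python
def link_both(coll_a, coll_b, a_id, b_id):
    # Two independent writes. $addToSet makes each one safe to repeat.
    coll_a.update_one({"_id": a_id}, {"$addToSet": {"b_ids": b_id}})
    coll_b.update_one({"_id": b_id}, {"$addToSet": {"a_ids": a_id}})


def unlink_both(coll_a, coll_b, a_id, b_id):
    # $pull removes the id from the array if it is there.
    coll_a.update_one({"_id": a_id}, {"$pull": {"b_ids": b_id}})
    coll_b.update_one({"_id": b_id}, {"$pull": {"a_ids": a_id}})


def linked_bs(coll_a, coll_b, a_id):
    # Read the ids stored on A, then fetch all linked B documents in one query.
    a_doc = coll_a.find_one({"_id": a_id})
    if a_doc is None:
        return []
    return list(coll_b.find({"_id": {"$in": a_doc.get("b_ids", [])}}))
```

Reads are fast in both directions because each document already carries the ids it needs. The cost is that the two writes can get out of step if the second one fails.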
Pros:
Cons:
Let's talk about "needs transactions". There are several ways to do transactions and it really depends on what type of safety you require.
"Should I use safe mode and manually check the data went in afterwards and try again on failure?"
You can definitely do this. You'll have to ask yourself, what's the worst that happens if only one of the saves fails?
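One way to sketch the "check and try again" idea, assuming acknowledged writes (what older drivers called safe mode), pymongo-style collections, and the hypothetical array fields `b_ids` and `a_ids`. Because `$addToSet` is idempotent, retrying the whole pair after a partial failure is safe. The retry count and delay here are arbitrary.

```python
import time


def link_with_retry(coll_a, coll_b, a_id, b_id, attempts=3, delay=0.1):
    """Apply both halves of the link, retrying the whole pair on failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            coll_a.update_one({"_id": a_id}, {"$addToSet": {"b_ids": b_id}})
            coll_b.update_one({"_id": b_id}, {"$addToSet": {"a_ids": a_id}})
            return
        except Exception as exc:  # with pymongo, catch pymongo.errors.PyMongoError instead
            last_error = exc
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise RuntimeError(f"link {a_id} <-> {b_id} may be half applied") from last_error


def is_fully_linked(coll_a, coll_b, a_id, b_id):
    # Manual check afterwards: both documents must reference each other.
    a_ok = coll_a.find_one({"_id": a_id, "b_ids": b_id}) is not None
    b_ok = coll_b.find_one({"_id": b_id, "a_ids": a_id}) is not None
    return a_ok and b_ok
```

If the retries run out, the link may exist on one side only, so the caller has to decide whether to log it for a later repair pass or undo the half that succeeded.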
Method #4: queue the change
I don't know if you've ever worked with queues, but if you have some leeway you can build a simple queue and have different jobs that update their respective collections.
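A very rough sketch of the queue idea, using a MongoDB collection as the queue. Everything here is an assumption for illustration: `jobs`, `coll_a`, and `coll_b` are pymongo-style collections, and the `state`, `a_ids`, and `b_ids` field names are invented. Enqueueing is a single insert, and a worker later applies the updates. For brevity one worker applies both updates here; splitting that into one job per collection is a variation on the same pattern.

```python
def enqueue_link(jobs, a_id, b_id):
    # The only write on the request path: one document, so it is atomic.
    jobs.insert_one({"a_id": a_id, "b_id": b_id, "state": "pending"})


def process_one(jobs, coll_a, coll_b):
    """Claim one pending job and apply it. Returns False if the queue is empty."""
    # find_one_and_update claims the job atomically, so two workers
    # cannot pick up the same one.
    job = jobs.find_one_and_update(
        {"state": "pending"},
        {"$set": {"state": "working"}},
    )
    if job is None:
        return False
    try:
        coll_a.update_one({"_id": job["a_id"]}, {"$addToSet": {"b_ids": job["b_id"]}})
        coll_b.update_one({"_id": job["b_id"]}, {"$addToSet": {"a_ids": job["a_id"]}})
    except Exception:
        # Put the job back so it is retried; $addToSet makes the retry safe.
        jobs.update_one({"_id": job["_id"]}, {"$set": {"state": "pending"}})
        raise
    jobs.update_one({"_id": job["_id"]}, {"$set": {"state": "done"}})
    return True
```

A real worker would also need some way to reclaim jobs stuck in the "working" state after a crash, which this sketch leaves out.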
This is a much more advanced solution. I would tend to go with #2 or #3.
Why don't you create a dedicated collection holding the relations between A and B as dedicated rows/documents, as one would in an RDBMS? You can modify the relation table with one operation, which is of course atomic.