Many to many update in MongoDB without transactions

Tags:

mongodb

I have two collections with a many-to-many relationship. I want to store an array of linked ObjectIds in both documents so that I can take Document A and retrieve all linked Document B's quickly, and vice versa.

Creating this link is a two-step process:

1. Add Document A's ObjectId to Document B
2. Add Document B's ObjectId to Document A

After watching a MongoDB video, I found this to be the recommended way of storing a many-to-many relationship between two collections.

I need to be sure that both updates are made. What is the recommended way of robustly dealing with this crucial two step process without a transaction?

I could condense this relationship into a single link collection. The advantage is a single update with no chance of Document B missing the link to Document A; the disadvantage is that I'm not really using MongoDB as intended. Still, because there is only a single update, a link collection that defines the many-to-many relationship seems more robust.

Should I use safe mode and manually check the data went in afterwards and try again on failure? Or should I represent the many-to-many relationship in just one of the collections and rely on an index to make sure I can still quickly get the linked documents?

Any recommendations? Thanks

546

asked Feb 14 '11 13:02

Typo Johnson

2 Answers

@Gareth, you have multiple legitimate ways to do this, so the key concern is how you plan to query the data (i.e., which queries need to be fast).

Here are a couple of methods.

Method #1: the "links" collection

You could build a collection that simply contains mappings between the collections.

Pros:

Supports atomic updates so that data is not lost

Cons:

Extra query when trying to move between collections
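To make the trade-off concrete, here is a minimal in-memory sketch of a links collection. It is plain Python rather than real driver code, and the names (`LinkCollection`, `a_id`, `b_id`) are hypothetical. The point is that creating a link is one write of one small record, while reading in either direction needs a lookup in the links first.

```python
# In-memory stand-in for a "links" collection. All names are hypothetical.
class LinkCollection:
    def __init__(self):
        # Each link is one (a_id, b_id) pair, i.e. one small document.
        self._links = set()

    def link(self, a_id, b_id):
        # One write creates the whole relationship. In MongoDB this would be
        # a single-document insert; a unique index on (a_id, b_id) is one way
        # to reject duplicates (an assumption, not part of the answer above).
        self._links.add((a_id, b_id))

    def unlink(self, a_id, b_id):
        self._links.discard((a_id, b_id))

    def b_ids_for(self, a_id):
        # Step 1 of a read: find the link records for this A.
        # Fetching the B documents themselves is the "extra query".
        return sorted(b for a, b in self._links if a == a_id)

    def a_ids_for(self, b_id):
        return sorted(a for a, b in self._links if b == b_id)


links = LinkCollection()
links.link("a1", "b1")
links.link("a1", "b2")
links.link("a2", "b1")
links.link("a1", "b1")  # repeating a link leaves a single record
```

Because each relationship lives in exactly one record, there is no state in which A points at B but B does not point back.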

Method #2: store copies of smaller mappings in larger collection

For example: you have millions of Products, but only a hundred Categories. Then you would store the Categories as an array inside each Product.

Pros:

Smallest footprint
Only need one update

Cons:

Extra query if you go the "wrong way"
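A rough sketch of Method #2, again using plain dictionaries instead of a real database. The field name `category_ids` and the sample data are made up for illustration. Going from a product to its categories is a direct read; going the "wrong way" means searching the products by array contents, which in MongoDB would be a query on the array field that an index on that field can serve.

```python
# Products hold the ids of their categories; categories hold nothing extra.
# Sample data and field names are hypothetical.
products = {
    "p1": {"name": "Kettle", "category_ids": ["kitchen", "electrical"]},
    "p2": {"name": "Toaster", "category_ids": ["kitchen"]},
    "p3": {"name": "Lamp", "category_ids": ["electrical"]},
}


def add_category(product_id, category_id):
    # The only write needed to create a link: one update to one document.
    ids = products[product_id]["category_ids"]
    if category_id not in ids:
        ids.append(category_id)


def categories_of(product_id):
    # The "right way": the ids are already inside the product.
    return list(products[product_id]["category_ids"])


def products_in(category_id):
    # The "wrong way": search products whose array contains the id.
    return sorted(
        pid for pid, doc in products.items()
        if category_id in doc["category_ids"]
    )


add_category("p3", "home")
add_category("p3", "home")  # adding twice keeps a single entry
```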

Method #3: store copies of all mappings in both collections (what you're suggesting)

Pros:

Single query access to move between either collection

Cons:

Potentially large indexes
Needs transactions (?)

Let's talk about "needs transactions". There are several ways to do transactions and it really depends on what type of safety you require.

> Should I use safe mode and manually check the data went in afterwards and try again on failure?

You can definitely do this. You'll have to ask yourself, what's the worst that happens if only one of the saves fails?
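One way to make "check and try again" safe is to make each of the two updates idempotent, so repeating the pair after a failure cannot create duplicates. The sketch below simulates this in memory. The helper names (`add_to_set`, `link_both`, `find_half_links`, `WriteFailed`) are hypothetical, and `add_to_set` only mimics what MongoDB's `$addToSet` update operator does to an array field.

```python
class WriteFailed(Exception):
    """Stands in for whatever error a failed acknowledged write raises."""


def add_to_set(doc, field, value):
    # Idempotent: adding the same id twice leaves one copy.
    ids = doc.setdefault(field, [])
    if value not in ids:
        ids.append(value)


def link_both(a_docs, b_docs, a_id, b_id, write=add_to_set, attempts=3):
    # Retry the whole pair; safe only because each step is idempotent.
    for _ in range(attempts):
        try:
            write(b_docs[b_id], "a_ids", a_id)
            write(a_docs[a_id], "b_ids", b_id)
            return True
        except WriteFailed:
            continue
    return False


def find_half_links(a_docs, b_docs):
    # Pairs recorded on only one side, for a periodic repair job.
    from_a = {(a, b) for a, d in a_docs.items() for b in d.get("b_ids", [])}
    from_b = {(a, b) for b, d in b_docs.items() for a in d.get("a_ids", [])}
    return sorted(from_a ^ from_b)


# Simulate a failure on the second write of the first attempt.
calls = {"n": 0}

def flaky_write(doc, field, value):
    calls["n"] += 1
    if calls["n"] == 2:
        raise WriteFailed("simulated failure")
    add_to_set(doc, field, value)


a_docs = {"a1": {}}
b_docs = {"b1": {}}
ok = link_both(a_docs, b_docs, "a1", "b1", write=flaky_write)
```

If the process dies between the two writes, the retry never happens, which is why a check like `find_half_links` is worth having as a fallback.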

Method #4: queue the change

I don't know if you've ever worked with queues, but if you have some leeway you can build a simple queue and have different jobs that update their respective collections.
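As a very rough illustration of the queue idea, the sketch below puts one job per side on a queue and lets a worker apply them, putting failed jobs back for another try. Everything here is hypothetical and in memory (`enqueue_link`, `run_worker`, the job tuple layout); a real setup would need the queue itself to be durable.

```python
from collections import deque


def enqueue_link(queue, a_id, b_id):
    # One job per side; each job touches exactly one document.
    queue.append(("a", a_id, "b_ids", b_id))
    queue.append(("b", b_id, "a_ids", a_id))


def apply_update(stores, job):
    # Idempotent update, so a retried job cannot add a duplicate id.
    collection, doc_id, field, value = job
    ids = stores[collection][doc_id].setdefault(field, [])
    if value not in ids:
        ids.append(value)


def run_worker(queue, stores, apply, max_steps=100):
    # Process jobs in order; a failed job goes back to the end of the queue.
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        job = queue.popleft()
        try:
            apply(stores, job)
        except RuntimeError:
            queue.append(job)
    return len(queue)  # number of jobs still pending


# Simulate one failure on the B-side update.
failures = {"left": 1}

def flaky_apply(stores, job):
    if job[0] == "b" and failures["left"] > 0:
        failures["left"] -= 1
        raise RuntimeError("simulated failure")
    apply_update(stores, job)


stores = {"a": {"a1": {}}, "b": {"b1": {}}}
queue = deque()
enqueue_link(queue, "a1", "b1")
pending = run_worker(queue, stores, flaky_apply)
```

The two sides are briefly out of sync while a job is waiting, so this only fits if you have the leeway mentioned above.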

This is a much more advanced solution. I would tend to go with #2 or #3.

103

answered Oct 27 '22 21:10

Gates VP

Why don't you create a dedicated collection that holds the relations between A and B as individual rows/documents, as one would do in an RDBMS? You can then modify the relation table with a single operation, which is of course atomic.

answered Oct 27 '22 20:10

Andreas Jung
