What is a good MongoDB document structure for most efficient querying of user followers/followees?

Tags: mongodb, documents

I've been wondering about the ideal document structure for maximum query efficiency in various situations, and there's one I want to ask about. It's really born of me not knowing how MongoDB behaves in memory in this specific kind of case. Let me give you a hypothetical scenario.

Imagine a Twitter-style system of Followers and Followees. After an admittedly cursory glance, the main options appear to be:

1. In each user document, a "followers" array containing references to the documents of all the other users they follow. Followees are found by searching for the current user in other users' "user.followers" arrays. The main downside would appear to be the potential query overhead of the followee search. Also, for a query specifically for the contents of "user.followers", does MongoDB access just the required field in users' documents, or is the whole user document loaded and the required field values then looked up from there? And is this cached/stored in such a way that a query over a large user base would require significantly more memory?
2. In each user document, storing both "followers" and "followees" for quicker access to each. This has the obvious downside of duplicate data, in the sense that an entry for user A following user B exists in both user documents in the respective field, and a deletion from one requires a matching deletion in the other. Technically, this could be considered doubling the number of points of potential failure for a simple deletion. And does MongoDB still suffer from what I've heard described as "swiss cheesing" of its memory-stored data when deletions occur, so that removals from 2 fields rather than 1 double the effect of that memory-hole problem?
3. A separate collection for storing users' followers, queried in a similar fashion to the user documents in option 1, except that the only data being accessed is followers, so if the user documents contain quite a lot of other data relevant to each user, we avoid accessing that data. This has something of a relational-database feel to it, though, and while I know that's not always a terrible approach just on principle, if one of the other approaches mentioned (or one I haven't considered) is better under Mongo's architecture, I'd love to learn!
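To make the three options concrete, here is a rough sketch of what the documents might look like, written as plain JavaScript literals so it runs without a database. The sample users, the `_id` values, and the field names `followees`, `followers`, `follower` and `followee` are assumptions for illustration only; in particular, the array of "users this user follows" is called `followees` here to keep the two directions apart.

```javascript
// Hypothetical sample data: alice follows bob and carol; bob follows carol.

// Option 1: one array per user document, holding the users they follow.
const option1 = [
  { _id: "alice", followees: ["bob", "carol"] },
  { _id: "bob",   followees: ["carol"] },
  { _id: "carol", followees: [] },
];

// Option 2: both directions stored, so every relationship appears twice.
const option2 = [
  { _id: "alice", followees: ["bob", "carol"], followers: [] },
  { _id: "bob",   followees: ["carol"],        followers: ["alice"] },
  { _id: "carol", followees: [],               followers: ["alice", "bob"] },
];

// Option 3: a separate collection with one small document per relationship.
const option3 = [
  { follower: "alice", followee: "bob" },
  { follower: "alice", followee: "carol" },
  { follower: "bob",   followee: "carol" },
];

// "Who follows carol?" answered under each option.
// Option 1 has to look inside every user's array.
const followers1 = option1
  .filter(u => u.followees.includes("carol"))
  .map(u => u._id);
// Option 2 reads a single document.
const followers2 = option2.find(u => u._id === "carol").followers;
// Option 3 filters the relationship documents only.
const followers3 = option3
  .filter(e => e.followee === "carol")
  .map(e => e.follower);
```

All three lookups give the same two users; what differs is how many documents are touched and how much data has to be kept in sync on a write.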

If anyone has any thoughts on this, or wants to tell me I've missed a very relevant and obvious docs page somewhere, or even wants to tell me that I'm just being stupid (though with an explanation of why, please ;) ), I'd love to hear from you!


asked Jul 16 '12 08:07

tdous

2 Answers

This is a classic follower-followee problem and there's no one answer to it. Check out this link:

mongo db design of following and feeds, where should I embed?

Actually this situation lends itself very well to a relational schema, if MongoDB and SQL server were the only choices you had. But this is a special type of relational problem wherein you have a two-way relationship. This can perhaps be better handled by a graph database:

http://forum.kohanaframework.org/discussion/10130/followers-and-following-database-design-like-twitter/p1

The thing is, you could keep either followers or followees in a User document, but not both, to avoid double-deletion issues. So if you must stick to MongoDB, one way out (assuming people don't follow/unfollow anyone that frequently) could be the following.

Keep just the followees in the document, because when I view my profile, I'd be interested in the people I follow (that's the reason I followed them in the first place, right?). And then do a query like:

db.Users.find({ followees : user_id })

This returns everyone who is following me (say my ID is 'user_id').

Another reason I don't suggest the other way round is that one may follow at most 30-40 people, so a User document storing 30-40 followees should be okay, as against a User document storing thousands of followers! With the followee-in-document approach, you get roughly even-sized User documents throughout. With the follower-in-document approach, you will have some very small but also some very bulky documents. And depending on the amount of follower data you put in (if any, apart from follower_id), you might want to be careful about the document size limit (16 MB per BSON document).
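As a rough illustration of the followee-in-document approach, the sketch below models the Users collection as an in-memory Map so it can run on its own; it is not the shell code itself. The sample users and the helper names `follow`, `unfollow` and `followersOf` are hypothetical. The comments show the shell operations each helper stands in for, assuming `followees` is an array field of user IDs.

```javascript
// In-memory stand-in for a Users collection (hypothetical data and field names).
const users = new Map([
  ["alice", { user_id: "alice", followees: [] }],
  ["bob",   { user_id: "bob",   followees: [] }],
  ["carol", { user_id: "carol", followees: [] }],
]);

// Shell equivalent:
//   db.Users.update({ user_id: me }, { $addToSet: { followees: them } })
function follow(me, them) {
  const doc = users.get(me);
  // Like $addToSet, only add the ID if it is not already present.
  if (!doc.followees.includes(them)) doc.followees.push(them);
}

// Shell equivalent:
//   db.Users.update({ user_id: me }, { $pull: { followees: them } })
function unfollow(me, them) {
  const doc = users.get(me);
  doc.followees = doc.followees.filter(id => id !== them);
}

// Shell equivalent:
//   db.Users.find({ followees: me })
// In MongoDB an index on the followees array (a multikey index) would let
// this lookup avoid scanning every user document.
function followersOf(me) {
  return [...users.values()]
    .filter(doc => doc.followees.includes(me))
    .map(doc => doc.user_id);
}

follow("alice", "carol");
follow("bob", "carol");
follow("bob", "carol");     // repeated follow leaves the array unchanged
unfollow("alice", "carol"); // only alice's own document is modified
```

Note that each follow or unfollow writes to exactly one document, which is the point of storing only one direction.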


answered Sep 20 '22 00:09

Aafreen Sheikh

Given that it's a many-to-many relationship, option (2) looks good to me. As for the matching deletions, it's usually not an issue, as long as you have some sort of reconciliation mechanism between the two documents.
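One possible shape for such a reconciliation mechanism is sketched below, on plain in-memory objects rather than a live collection. It assumes the `followees` arrays are the source of truth and rebuilds `followers` from them; that choice, the sample data, and the function name `reconcile` are all assumptions, not something the answer specifies.

```javascript
// Hypothetical option (2) data with one inconsistency:
// bob still lists alice as a follower, but alice no longer follows bob.
const users = {
  alice: { followees: ["carol"], followers: [] },
  bob:   { followees: [],        followers: ["alice"] },
  carol: { followees: [],        followers: ["alice"] },
};

// Rebuild every followers array from the followees arrays.
// Returns the IDs of users whose followers array had to be changed.
function reconcile(users) {
  // Start with an empty expected-followers list for every known user.
  const expected = {};
  for (const id of Object.keys(users)) expected[id] = [];

  // Each followee entry implies one follower entry on the other side.
  for (const [id, doc] of Object.entries(users)) {
    for (const followee of doc.followees) {
      if (expected[followee]) expected[followee].push(id);
    }
  }

  // Compare (order-insensitively) and repair where the two sides disagree.
  const repaired = [];
  for (const [id, doc] of Object.entries(users)) {
    const want = expected[id].slice().sort();
    const have = doc.followers.slice().sort();
    if (JSON.stringify(want) !== JSON.stringify(have)) {
      doc.followers = want;
      repaired.push(id);
    }
  }
  return repaired;
}

const repaired = reconcile(users);
```

In a real deployment this kind of check would more likely run as a periodic batch job over the collection than on every write.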

Fragmentation depends on the application's access patterns and is an issue with most data systems. Some significant changes have been made to MongoDB to avoid internal fragmentation. Further, there are offline compaction alternatives to fix fragmentation if it does happen.

answered Sep 21 '22 00:09

Sid
