I'm new to mongodb, maybe this is a trivial question. I have two mongodb collections: user
and post
. A user can create and follow multiple posts, and posts are listed sorted by last modification date. There may be a very large number of users following a specific post, so I don't want to keep the list of followers in each post document. On the other hand, one user will probably not follow more than a few thousand posts, so I decided to keep the list of followed posts' objectids in each user document.
In order to be able to quickly list the 50 most recently modified posts for a given user, I chose to keep the last_updated_at
field along with the post objectid.
The post
document is fairly basic:
{
"_id" : ObjectId("5163deebe4d809d55d27e847"),
"title" : "All about music"
"comments": [...]
...
}
The user
document looks like this:
{
"_id": ObjectId("5163deebe4d809d55d27e846"),
"posts": [{
"post": ObjectId("5163deebe4d809d55d27e847"),
"last_updated_at": ISODate("2013-04-09T11:27:07.184Z")
}, {
"post": ObjectId("5163deebe4d809d55d27e847"),
"last_updated_at": ISODate("2013-04-09T11:27:07.187Z")
}]
...
}
When a user creates or follows a post, I can simply $push
the post's ObjectId
and last_updated_at
to the end of the posts
list in the user's document. When a post is modified (for example when a comment is added to the post), I update the last_updated_at
field for that post in all the follower's user documents. That's pretty heavy, but I don't know how to avoid it.
When I want to get the list of 50 most recently updated posts for a user, I unfortunately need to get the whole list of followed posts, then sort by last_updated_at
in memory, then keep only the first 50 posts.
So I tried to change the implementation to reorder the list when a post is modified: I $push
it to the end of the list, and $pull
it from wherever it is. Since this is a two step procedure, there's a race condition where I might get twice the same post in the list. Is there no better way to maintain a sorted array in mongodb?
Since you may have frequent updates to the latest posts for a given user, you probably want to avoid the overhead of rewriting data unnecessarily to maintain a sorted array.
A better approach to consider would be to flatten the data model and use a separate collection instead of an ordered array:
(userID, postID, lastUpdated)
update()
with the multi:true
and upsert:true
options and $set
the last_updated_at to the new value.find()
with sort and limit options.If you do want to maintain ordered arrays, MongoDB 2.4 added two helpful features related to this use case:
So you can achieve your outcome of pushing to a fixed-sized array of 50 items sorted by last updated date descending:
db.user.update(
// Criteria
{ _id: ObjectId("5163deebe4d809d55d27e846") },
// Update
{ $push: {
posts: {
// Push one or more updates onto the posts array
$each: [
{
"post": ObjectId("5163deebe4d809d55d27e847"),
"last_updated_at": ISODate()
}
],
// Slice to max of 50 items
$slice:-50,
// Sorted by last_updated_at desc
$sort: {'last_updated_at': -1}
}
}}
)
The $push
will update the list in sorted order, with the $slice
trimming the list to the first 50 items. Since the posts aren't unique you'll still need to $pull
the original from the list first, eg:
db.user.update(
// Criteria
{ _id: ObjectId("5163deebe4d809d55d27e846") },
// Update
{
$pull: {
posts: { post: ObjectId("5163deebe4d809d55d27e847") }
}
}
)
A benefit of this approach is that array manipulation is being done on the server, but as with sorting the array in your application you may still be updating the document more than is required.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With