I am going to design a group chat application based on MongoDB. There are two schema design choices: one document per group chat message, or one document holding all of a group's messages.
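The first option looks like this:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var ObjectId = Schema.Types.ObjectId;

// Option 1: one document per chat message
var ChatMessageSchema = new Schema({
  fromUserId: ObjectId,
  toTroupeId: ObjectId,
  text: String,
  sent: Date
});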
The second option looks like this:
// Option 2: one document per group, with every message embedded in an array
var ChatMessageSchema = new Schema({
  toTroupeId: ObjectId,
  chats: [{
    fromUserId: ObjectId,
    text: String,
    sent: Date
  }]
});
Both designs have pros and cons. The drawback of the second option is that it is hard to index on the user and to search for messages by user, and a group with too many messages may have to be split across more than one document.
The first option seems more reasonable, since it allows searching messages by group ID or user ID as long as the indexes are set up properly.
But I wonder: with hundreds of thousands of messages in a group, there will be hundreds of thousands of corresponding documents for that one group. Will this affect database performance?
Any ideas on these design choices? Is the first option the optimal one, and if so, how can it be optimised?
I think it's hard to say that one database or another is the BEST without understanding more about the application but rest assured that MongoDB has been the choice for many popular chat applications.
I would suggest a third option: creating a new collection for every group, e.g. room_$groupid. In such a collection, you could insert every message separately. This would give you the benefit of getting a full chat room without a filter. You could simply return the last 200 or so messages from the collection.
It would also scale more easily, because you won't end up with a single massive collection that you have to filter through.
However, you would have to write the logic for selecting the right collection, but that should be a fairly trivial task. The downside is that it would be nearly impossible to do a text search over multiple groups without throwing performance out of the window.
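The collection-per-group idea can be sketched as below. This is only an illustration: it assumes a connected `Db` object from the official MongoDB Node.js driver is passed in as `db`, and the helper names `roomCollectionName` and `lastMessages`, the 24-hex-character check on the group id, and the use of the `sent` field for ordering are all assumptions rather than part of the suggestion itself.

```javascript
// Hypothetical helper: map a group id to the name of its own collection.
// Accepting only 24 hex characters (the string form of an ObjectId) keeps
// arbitrary input from turning into unexpected collection names.
function roomCollectionName(groupId) {
  const id = String(groupId);
  if (!/^[0-9a-fA-F]{24}$/.test(id)) {
    throw new Error('invalid group id: ' + id);
  }
  return 'room_' + id;
}

// Hypothetical helper: newest `limit` messages of one room.
// No group filter is needed because the collection only holds this room.
function lastMessages(db, groupId, limit = 200) {
  return db
    .collection(roomCollectionName(groupId))
    .find({})
    .sort({ sent: -1 }) // newest first
    .limit(limit)
    .toArray(); // a Promise when `db` is a real driver Db instance
}
```

Sorting on `sent` still benefits from an index on that field in each room collection, so the per-collection index cost does not disappear with this layout.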
Collection limit*
MongoDB is made to handle huge amounts of data, and its PDF "Performance Best Practices for MongoDB" states:
Avoid large documents
This is also made clear by the 16 MB document size limit.
So one can argue that MongoDB is specifically designed to handle the hundreds of thousands of documents that your first schema would produce.
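To get a rough feel for what the 16 MB limit means for the single-document design, the estimate below divides the cap by an assumed average message size. The 200-byte figure is purely an assumption for the sake of the arithmetic; real sizes depend on text length and BSON overhead.

```javascript
// MongoDB's maximum BSON document size: 16 MB.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Assumed average size of one embedded chat entry (user id, text, date).
// This is an illustrative guess, not a measured value.
const ASSUMED_BYTES_PER_MESSAGE = 200;

// Upper bound on how many entries of that size fit into one document.
function maxEmbeddedMessages(bytesPerMessage) {
  return Math.floor(MAX_DOCUMENT_BYTES / bytesPerMessage);
}

console.log(maxEmbeddedMessages(ASSUMED_BYTES_PER_MESSAGE)); // 83886
```

Under that assumption, a group with hundreds of thousands of messages would not fit into one document, whereas the one-document-per-message schema has no such ceiling.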
Simply reduce the number of indexes to what you need (do you really need to query by user that often, or could you accept that query being a lot slower?) and you should be fine with your first schema. Actually, I'm not sure there is any benefit to the second one.
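A minimal sketch of what a small index set for the first schema could look like, assuming the most frequent query is "latest messages in a group". The helper names `addChatIndexes` and `recentGroupMessages` are hypothetical; `schema` is assumed to be the Mongoose `ChatMessageSchema` from the question, and `ChatMessage` a model compiled from it.

```javascript
// One compound index serves "messages of a group, newest first".
// An index on fromUserId is deliberately left out here; add it only if
// per-user queries turn out to be frequent enough to justify it.
function addChatIndexes(schema) {
  schema.index({ toTroupeId: 1, sent: -1 });
  return schema;
}

// Fetch the newest `limit` messages of one group.
// The filter and sort order match the compound index above.
function recentGroupMessages(ChatMessage, groupId, limit = 50) {
  return ChatMessage
    .find({ toTroupeId: groupId })
    .sort({ sent: -1 })
    .limit(limit)
    .exec(); // a Promise when ChatMessage is a real Mongoose model
}
```

With this layout, a query by user alone would fall back to a collection scan, which is the slower-but-acceptable trade-off mentioned above.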