I am trying to build a schema for a chat application in MongoDB. I have two types of user models: Producer and Consumer. A Producer and a Consumer can have conversations with each other. My ultimate goal is to fetch all the conversations for any producer or consumer and show them in a list, just like messaging apps (e.g. Facebook) do.
Here is the schema I have come up with:
Producer: {
  _id: 123,
  name: "Sam"
}
Consumer: {
  _id: 456,
  name: "Mark"
}
Conversation: {
  _id: 321,
  producerId: 123,
  consumerId: 456,
  lastMessageId: 1111,
  lastMessageDate: "7/7/2018"
}
Message: {
  _id: 1111,
  conversationId: 321,
  body: "Hi"
}
Now I want to fetch, say, all the conversations of Sam. I want to show them in a list just like Facebook does, grouped by Consumer and sorted by time. I think I need the following queries for this:
1) Get all Conversations where producerId is 123, sorted by lastMessageDate. I can then show the list of all Conversations.
2) If I want to see all the messages in a conversation, I query Message and get all messages where conversationId is 321.
Now, for each new message, I also need to update the conversation with the new messageId and date every time. Is this the right way to proceed, and is it optimal considering the number of queries involved? Is there a better way I can proceed with this? Any help would be highly appreciated.
Design: I wouldn't say it's bad. For the case you've described, it's actually pretty good. Denormalizing the last message date and ID onto the conversation is great, especially if you plan a view with a list of all conversations: you get the last message date in the same query. Maybe go one step further and also store the last message text, if it's shown in this view.
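To make the denormalization concrete, here is a minimal sketch of the two writes a new message would need. The helper name `buildNewMessageWrites`, the `createdAt` and `lastMessageText` fields, and the preview length are assumptions for illustration, not part of the original schema.

```javascript
// Hypothetical helper: given a new message, build the message document and
// the update that keeps the conversation's denormalized fields in sync.
const PREVIEW_LENGTH = 80; // assumed cap for the preview shown in the list view

function buildNewMessageWrites(conversationId, messageId, body, sentAt) {
  // Document to insert into the message collection.
  const message = { _id: messageId, conversationId, body, createdAt: sentAt };

  // Filter and update to apply to the conversation document.
  const conversationUpdate = {
    filter: { _id: conversationId },
    update: {
      $set: {
        lastMessageId: messageId,
        lastMessageDate: sentAt,
        // Assumed extra field: a short preview of the last message text.
        lastMessageText: body.slice(0, PREVIEW_LENGTH),
      },
    },
  };

  return { message, conversationUpdate };
}
```

With the Node.js driver, these would typically be passed to `insertOne(message)` on the messages collection and `updateOne(filter, update)` on the conversations collection. Storing `lastMessageDate` as a real `Date` rather than a string keeps sorting by it reliable.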
You can read more on the pros and cons of denormalization (and schema modeling in general) on the MongoDB blog (parts 1, 2 and 3). It's not that fresh, but it's not outdated either.
Also, if such multi-document updates scare you because of possible inconsistencies, MongoDB 4.0 has you covered with multi-document transactions.
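A rough sketch of how the two writes could be wrapped in a transaction with the official Node.js driver follows. The function name `sendMessageInTransaction`, the collection names `messages` and `conversations`, and the `createdAt` field are assumptions; the `client` argument is expected to be a connected `MongoClient`, and transactions need a replica set (or sharded cluster) deployment rather than a standalone server.

```javascript
// Sketch only: insert the message and update the conversation atomically.
// `client` is assumed to be an already connected MongoClient.
async function sendMessageInTransaction(client, dbName, message) {
  const session = client.startSession();
  try {
    // withTransaction commits if the callback resolves and aborts if it throws.
    await session.withTransaction(async () => {
      const db = client.db(dbName);

      // Both writes carry the session so they belong to the same transaction.
      await db.collection("messages").insertOne(message, { session });
      await db.collection("conversations").updateOne(
        { _id: message.conversationId },
        {
          $set: {
            lastMessageId: message._id,
            lastMessageDate: message.createdAt,
          },
        },
        { session }
      );
    });
  } finally {
    // Always release the session, whether the transaction succeeded or not.
    await session.endSession();
  }
}
```

If either write fails, neither is applied, so the conversation never points at a message that does not exist.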
Querying: On one hand, you can issue multiple queries, and that's not bad at all (especially when some of them are easily cacheable, like the producer or consumer data). On the other hand, you can use an aggregation to fetch all of these at once if needed.
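As an illustration of the aggregation route, the pipeline below would return a producer's conversation list with the consumer's name and the last message body joined in. It is a sketch under assumptions: the collection names `consumers` and `messages` and the helper name `conversationListPipeline` are made up here, and `lastMessageDate` is assumed to be stored in a sortable type such as a `Date`.

```javascript
// Hypothetical helper: builds an aggregation pipeline to run on the
// conversations collection for one producer.
function conversationListPipeline(producerId) {
  return [
    // Only this producer's conversations, newest activity first.
    { $match: { producerId } },
    { $sort: { lastMessageDate: -1 } },
    // Join the consumer on the other side of each conversation.
    {
      $lookup: {
        from: "consumers",
        localField: "consumerId",
        foreignField: "_id",
        as: "consumer",
      },
    },
    { $unwind: "$consumer" },
    // Join the last message to show its text in the list.
    {
      $lookup: {
        from: "messages",
        localField: "lastMessageId",
        foreignField: "_id",
        as: "lastMessage",
      },
    },
    { $unwind: "$lastMessage" },
    // Keep only what the list view needs.
    {
      $project: {
        consumerName: "$consumer.name",
        lastMessageBody: "$lastMessage.body",
        lastMessageDate: 1,
      },
    },
  ];
}
```

With the Node.js driver this could be run as `db.collection("conversations").aggregate(conversationListPipeline(123)).toArray()`. An index on `producerId` and `lastMessageDate` would likely help the first two stages, and if the last message text is denormalized onto the conversation, the second `$lookup` is not needed.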