What pattern is used for storing and retrieving record audit trails in Mongo?

Tags: mongodb

I am looking into using Mongo to store my data. I want to store one document per change to a record. For example, a record represents a calendar event. Each time this event is updated (via a web form), I want to store the new version in a new document. This will allow the history of the event to be retrieved on request.

If I were to store this data using a relational database, I would have an 'events' table and an 'events_history' table:

'events' table:
event_id
last_event_history_id

'events_history' table:
event_history_id
event_id
event_date

So, when I want to retrieve a list of events (showing the latest history for each event), I would do this:

SELECT * FROM events_history eh, events e
WHERE
eh.event_history_id = e.last_event_history_id

However, I am unsure how to store the data and generate this list when using Mongo.

JoeTidee asked Mar 13 '23 18:03


1 Answer

Joe,

Your question is a common one for people coming to MongoDB from an RDBMS background (which, by the way, is exactly how I came to MongoDB myself).

I can relate to your question.

If I were to restate your question in a generic way, I would say:

How to model one-to-many relationships in MongoDB?

There are basically two approaches:

  1. Embedded Documents

You can have an "events" collection. Each document in this collection can contain a key called "Event_history", an array in which each entry is an old version of the event itself.

The discussion on embedded documents for MongoDB is here.
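As a rough illustration of the embedded approach, the sketch below builds the filter and update documents that could be passed to pymongo's Collection.update_one. It only constructs plain dictionaries, so it runs without a database. The helper name build_embedded_update, the "archived_at" field, and the sample event fields are assumptions made for this example, not part of the original answer.

```python
from copy import deepcopy
from datetime import datetime, timezone


def build_embedded_update(current_doc, new_fields):
    """Return (query, update) dicts suitable for Collection.update_one.

    Hypothetical helper: snapshots the current version of the event and
    appends it to the "Event_history" array while setting the new values.
    """
    # Copy the live fields, leaving out _id and the history array itself.
    old_version = {
        key: deepcopy(value)
        for key, value in current_doc.items()
        if key not in ("_id", "Event_history")
    }
    # "archived_at" is an assumed field recording when the version was retired.
    old_version["archived_at"] = datetime.now(timezone.utc)

    query = {"_id": current_doc["_id"]}
    update = {
        "$set": dict(new_fields),                  # overwrite the live fields
        "$push": {"Event_history": old_version},   # append the previous version
    }
    return query, update


# Sample data (made up for illustration).
current = {
    "_id": 1,
    "title": "Team meeting",
    "event_date": "2023-03-20",
    "Event_history": [],
}
query, update = build_embedded_update(current, {"event_date": "2023-03-21"})
# With a live connection this might be: db.events.update_one(query, update)
```

Note that every edit makes the event document larger, which is the growth concern raised in point 2 of the factors listed later in this answer.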

  2. Document References

This is very similar to what you do in relational databases. You can have two collections, each with its own documents: one collection for "active" events and one collection for historical events.

The discussion for Document references in MongoDB is here.
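A minimal sketch of the reference approach, assuming the collections mirror the tables from the question ("events" with last_event_history_id, "events_history" with event_id and event_date). The helper name record_change and the way history ids are supplied are assumptions for illustration. Like the example's inputs, the outputs are plain dictionaries, so nothing here needs a running server.

```python
def record_change(event, new_fields, next_history_id):
    """Return the documents needed to record one edit of an event.

    Hypothetical helper: produces the new history document plus the
    filter/update pair that repoints the event at its latest version.
    """
    # One new document per change, linked back to its event.
    history_doc = {"_id": next_history_id, "event_id": event["_id"], **new_fields}

    # The event document stays small: it only stores a pointer.
    events_filter = {"_id": event["_id"]}
    events_update = {"$set": {"last_event_history_id": next_history_id}}
    return history_doc, events_filter, events_update


# Sample data (made up for illustration).
event = {"_id": 1, "last_event_history_id": 100}
history_doc, events_filter, events_update = record_change(
    event, {"event_date": "2023-03-21"}, next_history_id=101
)
# With pymongo this might be:
#   db.events_history.insert_one(history_doc)
#   db.events.update_one(events_filter, events_update)
```

The two writes are separate operations, so an application using this layout would need to decide what happens if the second one fails.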

Now back to your question: which of these two approaches is better?

There are a couple of factors to consider:

1 - MongoDB queries do not join collections the way SQL does (the aggregation framework's $lookup stage, added in MongoDB 3.2, is the closest equivalent). If your workload is primarily reads and your documents/events do not change frequently, the approach with embedded documents will be easier and perform better.

2 - Avoid growing documents. If your events change frequently, causing the MongoDB documents to grow, then you should opt for design #2 with references. Document growth at scale usually hurts performance in MongoDB, and a single document is also capped at 16 MB. An in-depth discussion of why document growth should be avoided is here.

Without knowing the details of your app, my guess is that document references would be better for an event management system where history is an important feature. Have two separate collections and perform the join inside your app.
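To make "perform the join inside your app" concrete, here is a small sketch that reproduces the SQL list from the question in application code. The two lists stand in for the results of find() calls on the two collections; the function name latest_versions and the sample data are assumptions for illustration.

```python
def latest_versions(events, history):
    """Return the latest history document for each event.

    Application-side equivalent of joining events_history to events on
    event_history_id = last_event_history_id.
    """
    # Index the history documents by their id for constant-time lookup.
    history_by_id = {doc["_id"]: doc for doc in history}
    return [
        history_by_id[event["last_event_history_id"]]
        for event in events
        if event["last_event_history_id"] in history_by_id
    ]


# Sample data (made up); in a real app these would come from two queries.
events = [
    {"_id": 1, "last_event_history_id": 101},
    {"_id": 2, "last_event_history_id": 200},
]
history = [
    {"_id": 100, "event_id": 1, "event_date": "2023-03-20"},
    {"_id": 101, "event_id": 1, "event_date": "2023-03-21"},
    {"_id": 200, "event_id": 2, "event_date": "2023-04-01"},
]
result = latest_versions(events, history)
```

In practice the second query would probably fetch only the needed history documents, for example with a filter such as {"_id": {"$in": [...]}} built from the events' last_event_history_id values, rather than loading the whole history collection.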

BigDataKid answered Mar 15 '23 15:03