
When to embed documents in MongoDB

Tags:

mongodb

I'm trying to figure out how best to design MongoDB schemas. The MongoDB documentation recommends relying heavily on embedded documents for improved querying, but I'm wondering if my use case actually justifies referenced documents.

A very basic version of my current schema is: (Apologies for the pseudo-format, I'm not sure how to express MongoDB schemas)

users {
  email (string)
}

games {
  user (reference user document)
  date_started (timestamp)
  date_finished (timestamp)
  mode (string)
  score: {
    total_points (integer)
    time_elapsed (integer)
  }
}

Games are short (about 60 seconds long) and I expect a lot of concurrent writes to be taking place.

At some point, I'm going to want to calculate a high score list, possibly segmented (e.g., a high score list for a particular game.mode or date).

Are embedded documents the best approach here? Or is this truly a problem that relations solve better? How would these use cases best be solved in MongoDB?

Andy Baird asked Feb 22 '11 04:02



2 Answers

... is this truly a problem that relations solve better?

The key here is less about "is this a relation?" and more about "how am I going to access this?"

MongoDB is not "anti-reference". MongoDB does not have the benefits of joins, but it does have the benefit of embedded documents.

As long as you understand these trade-offs, it's perfectly fair to use references in MongoDB. It's really about how you plan to query these objects.

Are embedded documents the best approach here?

Maybe. Some things to consider:

  • Do games have value outside of the context of the user?
  • How many games will a single user have?
  • Are games transactional in nature?
  • How are you going to access games? Do you always need all of a user's games?

If you're planning to build leaderboards and a user can generate hundreds of game documents, then it's probably fair to keep games in their own collection. Storing ten thousand instances of "game" inside each user document isn't particularly useful.
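
As a rough illustration of why a separate collection suits leaderboards, here is a minimal Python sketch that builds an aggregation pipeline for a top-N list. The field names follow the schema in the question; the helper name `leaderboard_pipeline` and a collection called `games` are assumptions made for the example. The pipeline is plain data, so nothing here talks to a server; with PyMongo it would be handed to something like `db.games.aggregate(pipeline)`.

```python
def leaderboard_pipeline(mode=None, limit=10):
    """Build an aggregation pipeline for a top-N high score list.

    Assumes one document per game, shaped like the question's schema
    (a `mode` string and a nested `score.total_points` integer).
    """
    pipeline = []
    if mode is not None:
        # Restrict the leaderboard to a single game mode.
        pipeline.append({"$match": {"mode": mode}})
    # Highest scores first, then keep only the top `limit` games.
    pipeline.append({"$sort": {"score.total_points": -1}})
    pipeline.append({"$limit": limit})
    # Return only the fields a leaderboard needs.
    pipeline.append(
        {"$project": {"_id": 0, "user": 1, "mode": 1, "score.total_points": 1}}
    )
    return pipeline


if __name__ == "__main__":
    for stage in leaderboard_pipeline(mode="classic", limit=5):
        print(stage)
```

With games embedded inside user documents, the same query would first have to unwind every user's array of games, which is part of why the access pattern matters more than whether the data "is a relation".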

But depending on your answers to the above, you could really go either way. As a litmus test, I would try running some Map/Reduce jobs (e.g., building a simple leaderboard) to see how you feel about the structure of your data.
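
To try that litmus test without a running server, here is a pure-Python sketch of a map/reduce-style leaderboard. It is not MongoDB's own map-reduce facility, just the same map-then-reduce idea applied to game documents shaped like the ones in the question. The sample data, user ids, and function names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents, one per game (the referenced design).
games = [
    {"user": "u1", "mode": "classic", "score": {"total_points": 120, "time_elapsed": 58}},
    {"user": "u2", "mode": "classic", "score": {"total_points": 200, "time_elapsed": 60}},
    {"user": "u1", "mode": "classic", "score": {"total_points": 180, "time_elapsed": 55}},
    {"user": "u2", "mode": "blitz", "score": {"total_points": 90, "time_elapsed": 30}},
]


def map_game(game):
    """Map step: emit a ((mode, user), points) pair for one game."""
    return (game["mode"], game["user"]), game["score"]["total_points"]


def reduce_scores(values):
    """Reduce step: keep each user's best score within a mode."""
    return max(values)


def best_scores(games):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for game in games:
        key, value = map_game(game)
        grouped[key].append(value)
    return {key: reduce_scores(values) for key, values in grouped.items()}


def leaderboard(games, mode):
    """Return (user, best_points) pairs for one mode, highest first."""
    rows = [(user, pts) for (m, user), pts in best_scores(games).items() if m == mode]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    print(leaderboard(games, "classic"))  # [('u2', 200), ('u1', 180)]
```

If the map step feels awkward to write against your documents (for example, because it has to dig through an array of games inside each user), that is a hint the structure may be fighting the query.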

Gates VP answered Sep 22 '22 12:09



Why would you use a relation here? If 'email' is the only user property, then denormalizing and using an embedded document would be perfectly fine. If the user object contains other information, I would go for a reference.
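
Roughly, the two options look like this when written as Python dicts. This is only a sketch: the ids, email address, and helper names are invented for illustration, and the fields otherwise follow the schema in the question.

```python
# Hypothetical users collection, keyed by _id for easy lookup.
users = {"u1": {"_id": "u1", "email": "andy@example.com"}}


def game_with_embedded_user(user, mode, total_points):
    """Denormalized: copy the email into the game document itself."""
    return {
        "user": {"email": user["email"]},
        "mode": mode,
        "score": {"total_points": total_points},
    }


def game_with_user_reference(user, mode, total_points):
    """Referenced: store only the user's _id in the game document."""
    return {
        "user": user["_id"],
        "mode": mode,
        "score": {"total_points": total_points},
    }


def email_for(game, users_by_id):
    """Get the player's email for either document shape."""
    user = game["user"]
    if isinstance(user, dict):
        # Embedded: the email is already on the game document.
        return user["email"]
    # Referenced: a second lookup in the users collection is needed.
    return users_by_id[user]["email"]


if __name__ == "__main__":
    embedded = game_with_embedded_user(users["u1"], "classic", 120)
    referenced = game_with_user_reference(users["u1"], "classic", 120)
    print(email_for(embedded, users), email_for(referenced, users))
```

The embedded shape saves the second lookup, but if the email ever changes, every game document carrying a copy has to be updated. That cost grows as the user object gains more properties.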

Andreas Jung answered Sep 18 '22 12:09
