Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NoSQL normalizing help

Tags:

mongodb

nosql

I am just starting out with NoSQL databases and have been reading material to try and wrap my head around how to think in NoSQL terms. The best way for me to do this is to play around.

With that being said I started thinking about how I could implement a traditional relational model in NoSQL and would like some help and input from those who know NoSQL databases well.

Say I want the following relationships:

Owner 1 - M PC

PC 1 - M Parts

In this traditional relational schema we have an Owner of a PC and then each PC can be made up of many parts. This means that we generally would have the following tables:

Parts

PC

PCParts

Owner

I have a couple of questions about this.

  1. How would an experienced NoSQL developer create the data model for this?
  2. Would the PC just contain an array of Part keys? Or have I missed the point?

Any information on this would be welcomed.

like image 260
Tim Avatar asked Aug 31 '11 14:08

Tim


1 Answers

One solution would be to store an Owner document with one field consisting of a list of objectid references to PC documents. Likewise, PC documents would include a list of objectid references to Parts documents.

This is the way MongoDB simulates relationships. Think of an SQL foreign key, which resides in the child and references the parent, and reverse the direction of reference: MongoDB stores in a parent document a list of objectid's for its children.

But this is not normalization -- it's denormalization. It's like storing a comma-separated list of id's in an RDBMS, which would be a repeating group that breaks First Normal Form.

You might reasonably wonder how you would find out which PC's contain a given part, if the references are stored in the PC documents. For that, you'd have to store a redundant list of reference to PC's in the Part document, and then worry about how you're going to keep the bidirectional references in sync, risking anomalies where a PC thinks it uses a Part, but that respective Part has no reference to the PC (or vice versa).

You could create a MongoDB document mimicking an SQL many-to-many intersection table, where one document contains exactly one objectid reference to a PC and one reference to a Part. Then create many such documents, as you would create many rows in an intersection table in SQL. But because these are documents, not rows, there's no schema to enforce that all documents store only the one reference for each entity. And there's no such thing as a JOIN to do the lookup efficiently.

These are consequences of denormalization and document-oriented databases, and why relational databases still offer some advantages.

like image 92
Bill Karwin Avatar answered Nov 26 '22 14:11

Bill Karwin