
NoSQL and normalised data

I am thinking about starting my first CouchDB project and, coming from an ORM background, I am concerned that I may create documents that are difficult to maintain.

For example, if I have the following model :

A *--->(1)B

which means that every A object has exactly one B object, and many instances of A can share the same B object. In this case A holds a pointer/foreign key to B.

I could create a document that contains all the A data and the B data. However, the issue I have is that if at a later stage (after tens of thousands of documents have been created) I need to change some of the B data, I have to update all my documents.
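To make the concern concrete, here is a minimal sketch using plain Python dicts in place of CouchDB documents. No CouchDB client is involved, and the field names (`_id`, `type`, `name`, `b`) and the helper functions are made up purely for illustration.

```python
# Hypothetical documents modelled as plain dicts; no CouchDB client is used.

def make_embedded_docs(n, b):
    """Build n A documents, each carrying its own full copy of B."""
    return [{"_id": f"a{i}", "type": "A", "b": dict(b)} for i in range(n)]

def rename_b_embedded(docs, b_id, new_name):
    """Change B's name; every A that embeds that B has to be rewritten."""
    touched = 0
    for doc in docs:
        if doc["b"]["_id"] == b_id:
            doc["b"]["name"] = new_name
            touched += 1
    return touched

b = {"_id": "b1", "type": "B", "name": "old name"}
embedded = make_embedded_docs(10_000, b)

# One logical change to B turns into one write per embedding A document.
writes = rename_b_embedded(embedded, "b1", "new name")
print(writes)
```

The number of writes grows with the number of A documents that embed the same B, which is exactly the maintenance cost I am worried about.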

In an ORM/normalised database world I would simply update B and all my references would now be up to date.

How do I handle this in CouchDB or is the NoSQL approach not suited for these types of situations?

JD

JD. asked Dec 02 '11 14:12




1 Answer

The general answer to this question: There is no general answer to this question.

The point is that in NoSQL the data structure is dictated not by the data itself, but by the queries it must support. So, rather than applying the same pattern to each and every 1:N or M:N association, the NoSQL way is to use different patterns depending on your specific needs. Factors to weigh include, for instance:

  • Write/Read Ratio
  • Specific database features that make embedding easier or harder
  • The types of queries you need to support
  • Performance considerations on how the data can be indexed, sharded, federated or in any other way split or cached

Generally, my feeling is that beginners tend to 'over-embed', but I can speak for MongoDB only. Embedding is a powerful feature, but embedded objects are not 'first-class citizens', so it shouldn't be used as a replacement for every 1:N relation. Only for some :)
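As a rough illustration of the alternative to embedding, the sketch below keeps B as its own document and lets each A store only B's id. It uses a plain Python dict as a stand-in for the database, not a real CouchDB or MongoDB client, and the names `db`, `b_id` and `load_a_with_b` are assumptions made up for this example.

```python
# Hypothetical in-memory "database": a dict keyed by document _id.
db = {"b1": {"_id": "b1", "type": "B", "name": "old name"}}
for i in range(3):
    # Each A stores only the id of the B it belongs to.
    db[f"a{i}"] = {"_id": f"a{i}", "type": "A", "b_id": "b1"}

def load_a_with_b(db, a_id):
    """Read-time 'join': one lookup for A, a second for the B it points to."""
    a = db[a_id]
    return {**a, "b": db[a["b_id"]]}

# A single write updates B for every A that references it.
db["b1"]["name"] = "new name"

names = [load_a_with_b(db, f"a{i}")["b"]["name"] for i in range(3)]
print(names)
```

The trade-off is visible in `load_a_with_b`: updates become cheap, but every read of an A together with its B now costs an extra lookup, which is why the write/read ratio and the queries you need matter when choosing between the two shapes.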

mnemosyn answered Sep 22 '22 00:09
