How to handle changes in duplicated data in NoSQL

Question

We're evaluating NoSQL for an upcoming project. I tend to think of things in an RDBMS way and am having trouble conceptualizing the lack of normalization.

I understand that duplicating data is not considered wrong in NoSQL. What I'm having trouble understanding is how to propagate a change to every copy of the data so that anomalies don't creep in.

Explanation of Question by Example:

You are organizing a series of poker tournaments. You have players, locations, and tournament events. As I understand it, a tournament event might contain a location and a collection of players. It does not need to have all the player data, but if you want to get the names and home addresses of everyone going to the next tournament, that info should be in the tournament collection.

Someone has gotten married and moved, changing their last name and address. Does the application need to update the player collection and the tournament collection? Or is my model of the collections wrong? How do developers "keep track" of where information is duplicated?

Chris Shain · Accepted Answer

The model that I see being used quite a bit lately has two parts. The first is an authoritative "master" collection of data, modeled "relationally": in your case, the list of players and the list of tournaments, where each tournament record holds a list of player ids. The second is a denormalized collection (in your case, a list of tournaments with the fully-populated player data) that the application never writes to directly; it is only ever updated by running a periodic process over the "master" data.

This way the application only needs to update the master data, and the periodic update process will eventually rebuild the denormalized result. The trade-off is that the denormalized copy can be stale until the next run.
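
As a rough illustration of that pattern, here is a minimal sketch in Python that uses plain dictionaries in place of real collections. The names `players`, `tournaments`, and `rebuild_tournament_view`, and the sample records, are made up for this example and do not come from any particular NoSQL product; in practice the rebuild would be a scheduled job against your actual store.

```python
from copy import deepcopy

# Hypothetical "master" collections: normalized, and the only data the app writes to.
players = {
    1: {"name": "Alice Smith", "address": "1 Elm St"},
    2: {"name": "Bob Jones", "address": "2 Oak Ave"},
}
tournaments = {
    100: {"location": "Reno", "player_ids": [1, 2]},
}


def rebuild_tournament_view(players, tournaments):
    """Rebuild the denormalized read model from scratch using the master data."""
    view = {}
    for tournament_id, tournament in tournaments.items():
        view[tournament_id] = {
            "location": tournament["location"],
            # Copy the full player record into each tournament document.
            "players": [deepcopy(players[pid]) for pid in tournament["player_ids"]],
        }
    return view


# Initial run of the periodic process.
tournament_view = rebuild_tournament_view(players, tournaments)

# A player marries and moves: the application updates only the master record.
players[1].update(name="Alice Brown", address="9 Pine Rd")

# Until the next run, the denormalized copy still holds the old data.
stale_name = tournament_view[100]["players"][0]["name"]

# The next periodic run brings the denormalized copy back in sync.
tournament_view = rebuild_tournament_view(players, tournaments)
fresh_name = tournament_view[100]["players"][0]["name"]
```

Because the rebuild always starts from the master data, the application never has to track where a player's name or address was duplicated. The cost is the window of staleness between runs.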

Tags:

nosql

denormalization

justkevin
