
How do you implement schema changes in a NoSQL storage system?

How do you manage a major schema change when you are using a NoSQL store like SimpleDB?

I know that I am still thinking in SQL terms, but after working with SimpleDB for a few weeks I need to make a change to a running database. I would like to change one of the object classes so that it is keyed by a unique id rather than a business name. Because it is referenced by another object, I will also need to update the reference value in those objects.

With a SQL database you would run a set of SQL statements as part of the client software deployment process. Obviously this will not work with something like SimpleDB because

  • There is no equivalent of a SQL update statement.
  • Due to the distributed nature of SimpleDB, there is no way of knowing when the changes you have made to the database have 'filtered' out to all the nodes running your client software.

Some solutions I have thought of are

  • Each domain has a version number. The client software knows which version of the domain it should use. Write some code that copies the data from one domain version to another, making any required changes as you go. You can then install new client software that accesses the new domain version. This approach will not work unless you can 'freeze' all write access during the update process.

  • Each item has a version attribute that indicates the format used when it was stored. The client uses this attribute when loading the object into memory. The object can then be converted to the latest format when it is written back to SimpleDB. The problem with this is that the new software needs to be deployed to all servers before any writes in the new format occur, or clients running the old software will not know how to read the new format.
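
To make the second option concrete, here is a minimal sketch of a loader that upgrades items based on a version attribute. Plain Python dicts stand in for SimpleDB items, and the "phone" to "phones" change, the attribute name "version", and the function names are all hypothetical, not part of any SimpleDB API.

```python
CURRENT_VERSION = 2


def _v1_to_v2(item):
    # Hypothetical format change: v1 stored a single "phone" string,
    # v2 stores a "phones" list.
    upgraded = {k: v for k, v in item.items() if k != "phone"}
    upgraded["phones"] = [item["phone"]] if "phone" in item else []
    upgraded["version"] = 2
    return upgraded


# One upgrade step per old version; steps are chained until the item is current.
UPGRADERS = {1: _v1_to_v2}


def load_item(raw):
    """Return the item in the current format, upgrading it in memory if needed."""
    item = dict(raw)
    # Items written before versioning existed are treated as version 1.
    version = int(item.get("version", 1))
    if version > CURRENT_VERSION:
        # This is the deployment-order problem: an old client cannot
        # interpret a format newer than the one it was built for.
        raise ValueError(f"item version {version} is newer than this client")
    while version < CURRENT_VERSION:
        item = UPGRADERS[version](item)
        version = item["version"]
    return item
```

The upgraded item is only persisted in the new format when the client next writes it back, so old and new formats coexist in the domain for some time.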

It is all rather complex, and I am wondering whether I am missing something.

Thanks

Richard

richard asked Aug 30 '11 05:08


1 Answer

I use something similar to your second option, but without the version attribute.

First, try to keep your changes to things that are easy to make backward compatible - changing the primary key is the worst-case scenario for this.

Removing a field is easy - just stop writing to that field once all servers are running a version that doesn't require it.

Adding a field requires that you never write that object using code that won't save that field. If you can't deploy the new version everywhere at once, use an intermediate version that supports saving the field before you deploy a version that requires it.
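
As a rough sketch of what such an intermediate version could look like: the class below does not use the new field at all, but it carries any attribute it does not recognise through a load/save round trip, so a field written by newer code is not lost. The class name, the field names, and the use of dicts in place of stored items are assumptions for illustration.

```python
# Fields this (intermediate) client version understands.
KNOWN_FIELDS = ("name", "city")


class Business:
    def __init__(self, name, city="", extra=None):
        self.name = name
        self.city = city
        # Attributes written by newer code that this version does not know about.
        self.extra = dict(extra or {})

    @classmethod
    def from_item(cls, item):
        extra = {k: v for k, v in item.items() if k not in KNOWN_FIELDS}
        return cls(item["name"], item.get("city", ""), extra)

    def to_item(self):
        # Start from the unknown attributes so they are saved back unchanged,
        # then overwrite the fields this version owns.
        item = dict(self.extra)
        item["name"] = self.name
        item["city"] = self.city
        return item
```

For example, an item that already has a hypothetical "business_id" attribute keeps it after this version edits and saves the city.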

Changing a field is just a combination of these two operations.

With this approach, changes are applied as needed - write using the new version, but allow reading of the old version with default or derived values for the new field.

You can use the same code to update all records at once, though this may not be appropriate on a large dataset.
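
A minimal sketch of that idea, assuming a dict keyed by item name stands in for the store: one `upgrade` function fills in default or derived values, and it is used both when reading a single item and in a pass over every record. The field names "country" and "display_name" are hypothetical.

```python
def upgrade(item):
    """Return the item in the new format without modifying the original."""
    upgraded = dict(item)
    # Default value for a newly added field.
    upgraded.setdefault("country", "unknown")
    # Derived value for a newly added field, computed from an existing one.
    if "display_name" not in upgraded:
        upgraded["display_name"] = upgraded["name"].strip().title()
    return upgraded


def read(store, key):
    # On-demand path: old items are upgraded in memory when they are read.
    return upgrade(store[key])


def migrate_all(store):
    """Bulk path: rewrite every item still in the old format.

    Returns the number of items rewritten. Running it again is harmless,
    because items already in the new format are left alone.
    """
    rewritten = 0
    for key in list(store):
        new_item = upgrade(store[key])
        if new_item != store[key]:
            store[key] = new_item
            rewritten += 1
    return rewritten
```

On a real store the bulk pass would have to page through results and cope with concurrent writers, which is part of why it may not suit a large dataset.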

Changing the primary key can be handled the same way, but could get really complex depending on which NoSQL system you are using. You are probably stuck with designing custom migration code in this case.
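
For illustration only, custom migration code for the case in the question (items keyed by business name, referenced by name from other items) might look roughly like this. Dicts stand in for the two sets of items, and the function name, the "business" reference attribute, and the id generator are assumptions; the sketch also ignores concurrent writes, which a real migration would have to deal with.

```python
import uuid


def rekey(businesses, referrers, ref_attr="business",
          new_id=lambda: str(uuid.uuid4())):
    """Build re-keyed copies of both collections.

    businesses: {business_name: attributes}
    referrers:  {item_key: attributes}, where attributes[ref_attr] holds
                a business name.
    Returns (new_businesses, new_referrers); the inputs are not modified.
    """
    name_to_id = {}
    new_businesses = {}
    for name, attrs in businesses.items():
        business_id = new_id()
        name_to_id[name] = business_id
        # The old key becomes an ordinary attribute so it is not lost.
        new_businesses[business_id] = {**attrs, "name": name}

    new_referrers = {}
    for key, attrs in referrers.items():
        updated = dict(attrs)
        old_ref = updated.get(ref_attr)
        # Only rewrite references that still point at a known business name.
        if old_ref in name_to_id:
            updated[ref_attr] = name_to_id[old_ref]
        new_referrers[key] = updated
    return new_businesses, new_referrers
```

Writing the results to new copies rather than updating in place makes it possible to check the mapping before any client is switched over.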

Tom Clarkson answered Oct 22 '22 08:10