 

How should I approach migrating data from a "bad" database design to a usable design?

The current project I inherited mainly revolves around one unnormalized table. There are some attempts at normalization but the necessary constraints weren't put in place.

Example: the Project table has a client name column (among other values), and there is also a clients table that contains only client names, with no keys anywhere. The clients table is used just as a pool of values to offer the user when adding a new project. It has no primary key, and there is no foreign key linking the two tables.

"Design patterns" such as this are common throughout the current state of the database and in the applications that use it. The tools I have at my disposal are SQL Server 2005, SQL Server Management Studio, and Visual Studio 2008. My initial approach has been to manually determine which information needs normalization and then run SELECT INTO queries. Is there a better approach than going case by case, or any way this could be automated?

Edit: Also, I've discovered that the "work order number" isn't an IDENTITY (autonumber, unique) field. The numbers are generated sequentially and are unique to each work order. There are some gaps in the existing numbering, but all values are unique. Is the best approach for this to write a stored procedure that generates dummy rows before migrating?

llamaoo7 asked Mar 21 '09 20:03




2 Answers

The best approach to migrating to a usable design? CAREFULLY

Unless you're willing to break (and fix) every application that currently uses the database, your options are limited, because you can't change the existing structure very much.

Before you begin, think carefully about your motivations - if you have an existing issue (a bug to fix, an enhancement to make) then go ahead slowly. However, it's rarely worthwhile to monkey around with a working production system just to achieve an improvement that no one else will ever notice. Note that this can play in your favour - if there's an existing issue, you can point out to management that the most cost-effective way to fix things is to alter the database structure in this way. This means you have management support for the changes - and (hopefully) their backing if something goes pear-shaped.

Some practical thoughts ...

Make one change at a time ... and only one change. Make sure each change is correct before you move on. The old proverb of "measure twice, cut once" is relevant.

Automate, Automate, Automate ... Never ever make changes to the production system "live" using SQL Server Management Studio. Write SQL scripts that perform the entire change in one go; develop and test these against a copy of the database to make sure you get them right. Don't host that copy on the production server - you might accidentally run a script against production; use a dedicated test server (if the database size is under 4 GB, use SQL Server Express running on your own box).
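To illustrate the "one go, against a copy" idea, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely as a stand-in for SQL Server so that it runs anywhere; the table names, the `Client` table, and the `apply_change_script` helper are all hypothetical and not part of the original question.

```python
import sqlite3


def apply_change_script(conn, statements):
    """Run every statement in one transaction; undo all of them on any error."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")


# Stand-in for the production database (hypothetical schema).
# isolation_level=None lets the script control transactions explicitly.
production = sqlite3.connect(":memory:", isolation_level=None)
production.execute("CREATE TABLE Project (ProjectName TEXT, ClientName TEXT)")
production.execute("INSERT INTO Project VALUES ('Website', 'Acme')")

# Develop and test against a copy, never against production itself.
test_copy = sqlite3.connect(":memory:", isolation_level=None)
production.backup(test_copy)

# The whole change lives in one script that can be kept and replayed.
change_script = [
    "CREATE TABLE Client ("
    " ClientId INTEGER PRIMARY KEY,"
    " ClientName TEXT NOT NULL UNIQUE)",
    "INSERT INTO Client (ClientName) SELECT DISTINCT ClientName FROM Project",
]
apply_change_script(test_copy, change_script)
```

Because the script either applies completely or not at all, a failed run leaves the test copy in a known state, which makes it easier to fix the script and try again.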

Backups ... the first step in any script should be to back up the database, so that you've got a way back if something does go wrong.

Documentation ... if someone comes to you in twelve months, asking why feature X of their application is broken, you'll need a history of the exact changes made to the database to help with diagnosis and repair. A good first step is to keep all your change scripts.

Keys ... it's usually a good idea to keep the primary and foreign keys abstract, within the database and not revealed through the application. Things that look like keys at a business level (like your work order number) have a disturbing habit of having exceptions. Introduce your keys as additional columns with appropriate constraints, but don't change the definitions of existing ones.
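As a rough sketch of "introduce your keys as additional columns", the example below adds a surrogate `ClientId` next to the existing client name columns without touching their definitions. It uses Python's `sqlite3` as a stand-in so that it is runnable; the column names, index name, and sample rows are assumptions. SQLite cannot add a primary key column to an existing table, so a unique index plays that role here; on SQL Server you would more likely add an IDENTITY column with a real primary key constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Existing structure (hypothetical sample data): left unchanged.
    CREATE TABLE Project (ProjectName TEXT, ClientName TEXT);
    CREATE TABLE Clients (ClientName TEXT);
    INSERT INTO Clients VALUES ('Acme'), ('Globex');
    INSERT INTO Project VALUES
        ('Website', 'Acme'), ('Intranet', 'Globex'), ('Audit', 'Acme');
""")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- New abstract key on Clients, added as an extra column.
    ALTER TABLE Clients ADD COLUMN ClientId INTEGER;
    UPDATE Clients SET ClientId = rowid;
    CREATE UNIQUE INDEX UX_Clients_ClientId ON Clients (ClientId);

    -- New foreign key column on Project; ClientName stays as it was.
    ALTER TABLE Project
        ADD COLUMN ClientId INTEGER REFERENCES Clients (ClientId);
    UPDATE Project
       SET ClientId = (SELECT c.ClientId
                         FROM Clients c
                        WHERE c.ClientName = Project.ClientName);
""")

# Projects whose client name matched nothing still have a NULL ClientId.
unmatched = conn.execute(
    "SELECT ProjectName FROM Project WHERE ClientId IS NULL"
).fetchall()
```

Existing applications that read and write `ClientName` keep working, while new code can start joining on `ClientId`. Any rows left in `unmatched` point at data that needs cleaning before the new column can be made mandatory.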

Good luck!

Bevan answered Oct 15 '22 00:10


  1. Create the new database the way you think it should be structured.
  2. Create an importError table in the new database with columns like "oldId" and "errorDesc"
  3. Write a straightforward, procedural, legible script that attempts to select a row from the old structure and insert it into the new structure. If an insert fails, log as specific an error as possible to the importError table (specifically, why the insert failed).
  4. Run the script.
  5. Validate the new data. Check whether there are errors logged to the importError table. If the data is invalid or there are errors, refactor your script and run it again, possibly modifying your new database structure where necessary.
  6. Repeat steps 1-5 until you have a solid conversion script.

The result of this process is that you have: a) a new db structure that is validated against the old structure and tested against "pragmatism"; and b) a log of potential issues you may need to code against (such as errors that you can't fix through your conversion because they require a concession in your schema that you don't want).

(I might note that it's helpful to write the script in your scripting/programming language of choice, rather than in, say, SQL.)
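The steps above might look something like this in Python. This is only a sketch: `sqlite3` stands in for both the old and new databases, and the schemas, sample rows, and the `migrate` function are invented for illustration. Since the old table has no key, the work order number is used as "oldId" here, which is an assumption.

```python
import sqlite3

# Old structure (hypothetical): one unnormalized table, no keys.
old_db = sqlite3.connect(":memory:")
old_db.executescript("""
    CREATE TABLE Project (WorkOrderNumber INTEGER, ProjectName TEXT, ClientName TEXT);
    INSERT INTO Project VALUES
        (1001, 'Website', 'Acme'),
        (1002, 'Intranet', NULL),
        (1004, 'Audit', 'Acme');
""")

# Steps 1 and 2: the new structure plus the importError table.
new_db = sqlite3.connect(":memory:")
new_db.executescript("""
    CREATE TABLE Client (
        ClientId INTEGER PRIMARY KEY,
        ClientName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Project (
        ProjectId INTEGER PRIMARY KEY,
        WorkOrderNumber INTEGER NOT NULL UNIQUE,
        ProjectName TEXT NOT NULL,
        ClientId INTEGER NOT NULL REFERENCES Client (ClientId)
    );
    CREATE TABLE importError (oldId INTEGER, errorDesc TEXT);
""")
new_db.execute("PRAGMA foreign_keys = ON")


def migrate(old_db, new_db):
    """Step 3: copy rows one at a time, logging each failure. Returns rows imported."""
    imported = 0
    rows = old_db.execute(
        "SELECT WorkOrderNumber, ProjectName, ClientName FROM Project"
    ).fetchall()
    for work_order, project_name, client_name in rows:
        try:
            with new_db:  # commits on success, rolls back this row on error
                found = new_db.execute(
                    "SELECT ClientId FROM Client WHERE ClientName = ?",
                    (client_name,),
                ).fetchone()
                if found is None:
                    client_id = new_db.execute(
                        "INSERT INTO Client (ClientName) VALUES (?)",
                        (client_name,),
                    ).lastrowid
                else:
                    client_id = found[0]
                new_db.execute(
                    "INSERT INTO Project (WorkOrderNumber, ProjectName, ClientId)"
                    " VALUES (?, ?, ?)",
                    (work_order, project_name, client_id),
                )
            imported += 1
        except sqlite3.IntegrityError as exc:
            with new_db:
                new_db.execute(
                    "INSERT INTO importError (oldId, errorDesc) VALUES (?, ?)",
                    (work_order, str(exc)),
                )
    return imported


# Step 4: run it, then inspect importError (step 5).
imported_count = migrate(old_db, new_db)
errors = new_db.execute("SELECT oldId, errorDesc FROM importError").fetchall()
```

In this sketch the work order number is carried over as an ordinary unique column, so the gap at 1003 is preserved as-is and no dummy rows are needed. The row with the missing client name ends up in `importError` rather than aborting the whole run.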

Rahul answered Oct 14 '22 23:10