How should I approach migrating data from a "bad" database design to a usable design?

The current project I inherited mainly revolves around one unnormalized table. There are some attempts at normalization but the necessary constraints weren't put in place.

Example: The Project table has a client name column (among other values), and there is also a clients table that contains only client names. The clients table is used purely as a pool of values to offer the user when adding a new project. It has no primary key, and nothing references it through a foreign key.

"Design patterns" such as this are common throughout the current state of the database and in the applications that use it. The tools I have at my disposal are SQL Server 2005, SQL Server Management Studio, and Visual Studio 2008. My initial approach has been to manually determine which information needs normalization and then run SELECT INTO queries. Is there a better approach than going case by case, or any way this could be automated?

Edit: I've also discovered that the "work order number" isn't an IDENTITY (autonumber) column. The numbers are generated sequentially and each one is unique to its work order, but there are some gaps in the existing numbering. Is the best approach for this to write a stored procedure that generates dummy rows before migrating?

asked Mar 21 '09 20:03

llamaoo7

2 Answers

The best approach to migrating to a usable design? CAREFULLY

Unless you're willing to break (and fix) every application that currently uses the database, your options are limited, because you can't change the existing structure very much.

Before you begin, think carefully about your motivations - if you have an existing issue (a bug to fix, an enhancement to make) then go ahead slowly. However, it's rarely worthwhile to monkey around with a working production system just to achieve an improvement that no one else will ever notice. Note that this can work in your favour - if there's an existing issue, you can point out to management that the most cost-effective way to fix it is to alter the database structure. That gives you management support for the changes - and (hopefully) their backing if something goes pear-shaped.

Some practical thoughts ...

Make one change at a time ... and only one change. Make sure each change is correct before you move on. The old proverb of "measure twice, cut once" is relevant.
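One way to check a single change before moving on is sketched below: it looks for project rows whose client name has no match in the clients table, which is what would block a foreign key later. This is only an illustration. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, and the table and column names (project, clients, client_name, name) are assumptions based on the question, not the real schema.

```python
import sqlite3


def find_orphan_client_names(conn):
    """Return distinct client names used by projects but missing from clients."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.client_name
        FROM project AS p
        LEFT JOIN clients AS c ON c.name = p.client_name
        WHERE c.name IS NULL
        ORDER BY p.client_name
        """
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a tiny in-memory copy of the (assumed) legacy tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clients (name TEXT);
        CREATE TABLE project (title TEXT, client_name TEXT);
        INSERT INTO clients VALUES ('Acme'), ('Globex');
        INSERT INTO project VALUES
            ('Website', 'Acme'),
            ('Audit', 'Globex'),
            ('Rewrite', 'Initech'),
            ('Survey', 'Initech');
        """
    )
    return conn


# Names listed here must be fixed or added before a constraint can go on.
orphans = find_orphan_client_names(build_demo_db())
print(orphans)
```

If the list comes back empty on the test copy, that one change is probably safe to script; if not, the names it returns are the data to clean up first.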

Automate Automate Automate ... Never ever make the changes to the production system "live" using SQL Server Management Studio. Write SQL scripts that perform the entire change in one go; develop and test these against a copy of the database to make sure you get them right. Don't host your test copy on the production server - you might accidentally run the script against the production database; use a dedicated test server (if the database size is under 4 GB, use SQL Server Express running on your own box).
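A minimal sketch of "the entire change in one go" is to run every statement of a change inside a single transaction, so a failure part-way through leaves the database as it was. The example uses Python's built-in sqlite3 module and an in-memory database as a stand-in for SQL Server; apply_change, the table names, and the sample statements are all hypothetical.

```python
import sqlite3


def apply_change(conn, statements):
    """Run all statements in one transaction; undo everything on failure."""
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


def table_names(conn):
    """List user tables so the effect of a change can be inspected."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# isolation_level=None leaves transaction control to the explicit BEGIN above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE project (title TEXT, client_name TEXT)")
conn.execute("INSERT INTO project VALUES ('Website', 'Acme'), ('Audit', 'Acme')")

good_change = [
    "CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)",
    "INSERT INTO client (name) SELECT DISTINCT client_name FROM project",
]
bad_change = [
    "CREATE TABLE work_order_audit (note TEXT)",
    "INSERT INTO no_such_table VALUES (1)",  # fails, so the CREATE is undone too
]

good_ok = apply_change(conn, good_change)
bad_ok = apply_change(conn, bad_change)
tables_after = table_names(conn)
print(good_ok, bad_ok, tables_after)
```

The point of the design is that a half-applied change never reaches the database; the same idea applies to a T-SQL script wrapped in a transaction and tested against a copy first.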

Backups ... the first step in any script should be to back up the database, so that you've got a way back if something does go wrong.
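As an illustration of "back up first, then change", the sketch below copies the whole database before running a change script. It uses Python's built-in sqlite3 module as a stand-in; on SQL Server the first step would be a BACKUP DATABASE statement instead. The function name backup_then_run and the sample tables are hypothetical, and a real backup would go to a file on separate storage rather than to memory.

```python
import sqlite3


def backup_then_run(conn, change_sql):
    """Copy the whole database first, then apply the change script."""
    # In-memory target only for the demo; a real backup belongs on disk.
    snapshot = sqlite3.connect(":memory:")
    conn.backup(snapshot)           # full copy taken before anything changes
    conn.executescript(change_sql)  # the change itself runs second
    return snapshot


live = sqlite3.connect(":memory:")
live.executescript(
    """
    CREATE TABLE project (title TEXT, client_name TEXT);
    INSERT INTO project VALUES ('Website', 'Acme'), ('Audit', 'Globex');
    """
)

snapshot = backup_then_run(
    live, "DELETE FROM project WHERE client_name = 'Globex';"
)

# The live database has changed, but the snapshot still holds the old rows.
live_count = live.execute("SELECT COUNT(*) FROM project").fetchone()[0]
saved_count = snapshot.execute("SELECT COUNT(*) FROM project").fetchone()[0]
print(live_count, saved_count)
```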

Documentation ... if someone comes to you in twelve months, asking why feature X of their application is broken, you'll need a history of the exact changes made to the database to help with diagnosis and repair. A good first step is to keep all your change scripts.
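Beyond keeping the script files, one possible way to get that history is to have the database record each change script as it is applied. The sketch below is an assumption-laden illustration, not part of the original advice: the schema_change_log table, the apply_logged helper, and the script name are all hypothetical, and Python's built-in sqlite3 module stands in for SQL Server.

```python
import sqlite3
from datetime import datetime, timezone


def apply_logged(conn, name, sql):
    """Apply a named change script once and record it in schema_change_log."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_change_log ("
        " name TEXT PRIMARY KEY,"
        " applied_at TEXT NOT NULL,"
        " script TEXT NOT NULL)"
    )
    already = conn.execute(
        "SELECT 1 FROM schema_change_log WHERE name = ?", (name,)
    ).fetchone()
    if already:
        return False  # this script has been run before; do nothing
    conn.executescript(sql)
    conn.execute(
        "INSERT INTO schema_change_log (name, applied_at, script)"
        " VALUES (?, ?, ?)",
        (name, datetime.now(timezone.utc).isoformat(), sql),
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (title TEXT, client_name TEXT)")

script = (
    "CREATE TABLE client ("
    " client_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);"
)
first_run = apply_logged(conn, "001_add_client_table", script)
second_run = apply_logged(conn, "001_add_client_table", script)

history = [
    row[0]
    for row in conn.execute("SELECT name FROM schema_change_log ORDER BY applied_at")
]
print(first_run, second_run, history)
```

Storing the script text next to its name and timestamp means the log alone can answer "what exactly was changed, and when" a year later.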

Keys ... it's usually a good idea to keep the primary and foreign keys abstract, within the database and not revealed through the application. Things that look like keys at a business level (like your work order number) have a disturbing habit of having exceptions. Introduce your keys as additional columns with appropriate constraints, but don't change the definitions of existing ones.
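A sketch of that last point, under stated assumptions: add an abstract key as an extra column, and leave client_name and the work order number exactly as they are, gaps included. Python's built-in sqlite3 module stands in for SQL Server here (where the new key would more likely be an IDENTITY column), and every name in the example (client, client_id, ux_project_work_order, the sample rows) is hypothetical.

```python
import sqlite3


def add_surrogate_client_key(conn):
    """Add client_id alongside the existing columns without redefining them."""
    conn.executescript(
        """
        CREATE TABLE client (
            client_id INTEGER PRIMARY KEY,  -- abstract key, not shown to users
            name      TEXT NOT NULL UNIQUE
        );

        -- Collect names from both the project rows and the old value pool.
        INSERT INTO client (name)
            SELECT client_name AS name FROM project WHERE client_name IS NOT NULL
            UNION
            SELECT name FROM clients WHERE name IS NOT NULL
            ORDER BY name;

        -- New column only; client_name stays in place for existing applications.
        ALTER TABLE project ADD COLUMN client_id INTEGER REFERENCES client (client_id);

        UPDATE project
           SET client_id = (SELECT c.client_id FROM client AS c
                            WHERE c.name = project.client_name);

        -- The business number keeps its values and gaps; it just becomes unique.
        CREATE UNIQUE INDEX ux_project_work_order ON project (work_order_number);
        """
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clients (name TEXT);
    CREATE TABLE project (work_order_number INTEGER, title TEXT, client_name TEXT);
    INSERT INTO clients VALUES ('Acme'), ('Globex'), ('Initech');
    INSERT INTO project VALUES
        (1001, 'Website', 'Acme'),
        (1002, 'Audit', 'Globex'),
        (1005, 'Rewrite', 'Acme');
    """
)
add_surrogate_client_key(conn)

joined = conn.execute(
    """
    SELECT p.work_order_number, p.client_name, c.name
    FROM project AS p
    JOIN client AS c ON c.client_id = p.client_id
    ORDER BY p.work_order_number
    """
).fetchall()
client_count = conn.execute("SELECT COUNT(*) FROM client").fetchone()[0]
print(joined, client_count)
```

Because the work order number is no longer asked to act as the key, its gaps stop mattering and no dummy rows are needed.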

Good luck!

answered Oct 15 '22 00:10

Bevan

1. Create the new database the way you think it should be structured.
2. Create an importError table in the new database with columns like "oldId" and "errorDesc".
3. Write a straightforward, procedural, legible script that attempts to select a row from the old structure and insert it into the new structure. If an insert fails, log as specific an error as possible to the importError table (specifically, why the insert failed).
4. Run the script.
5. Validate the new data. Check whether there are errors logged to the importError table. If the data is invalid or there are errors, refactor your script and run it again, possibly modifying your new database structure where necessary.
6. Repeat steps 1-5 until you have a solid conversion script.

The result of this process will be that you have: a) a new database structure that is validated against the old structure and tested against "pragmatism"; b) a log of potential issues you may need to code against (such as errors that you can't fix through your conversion because they require a concession in your schema that you don't want to make).

(I might note that it's helpful to write the script in your scripting/programming language of choice, rather than in, say, SQL.)
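The steps above can be sketched roughly as follows. This is a minimal illustration rather than a finished converter: it uses Python's built-in sqlite3 module with two in-memory databases in place of SQL Server, and apart from importError, oldId and errorDesc, all names (migrate_projects, project, client, the sample rows) are hypothetical.

```python
import sqlite3


def migrate_projects(old, new):
    """Copy projects row by row; log each failure instead of stopping."""
    imported = 0
    rows = old.execute(
        "SELECT rowid, title, client_name FROM project ORDER BY rowid"
    ).fetchall()
    for old_id, title, client_name in rows:
        try:
            client = new.execute(
                "SELECT client_id FROM client WHERE name = ?", (client_name,)
            ).fetchone()
            if client is None:
                raise ValueError(f"unknown client {client_name!r}")
            new.execute(
                "INSERT INTO project (title, client_id) VALUES (?, ?)",
                (title, client[0]),
            )
            imported += 1
        except (sqlite3.Error, ValueError) as exc:
            # Record which old row failed and why, then carry on.
            new.execute(
                "INSERT INTO importError (oldId, errorDesc) VALUES (?, ?)",
                (old_id, str(exc)),
            )
    new.commit()
    return imported


old = sqlite3.connect(":memory:")
old.executescript(
    """
    CREATE TABLE project (title TEXT, client_name TEXT);
    INSERT INTO project VALUES
        ('Website', 'Acme'),
        (NULL, 'Acme'),          -- violates NOT NULL in the new schema
        ('Rewrite', 'Initech');  -- client missing from the new schema
    """
)

new = sqlite3.connect(":memory:")
new.executescript(
    """
    CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE project (
        project_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        client_id  INTEGER NOT NULL REFERENCES client (client_id)
    );
    CREATE TABLE importError (oldId INTEGER, errorDesc TEXT);
    INSERT INTO client (name) VALUES ('Acme');
    """
)

imported = migrate_projects(old, new)
errors = new.execute(
    "SELECT oldId, errorDesc FROM importError ORDER BY oldId"
).fetchall()
print(imported, errors)
```

Each pass leaves a list of old row ids and reasons in importError, which is what drives the next round of fixes to either the script or the new schema.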

answered Oct 14 '22 23:10

Rahul

Tags: sql, sql-server, refactoring, rdbms, normalization
