I'd like to convert an existing table into the 1st normal form (simplest normalization possible; see the example).
Do you happen to know what is the T-SQL is for this sort of problem? Many thanks!
Update
Tried the answer below, it worked perfectly. Here is the steps I used to test the answer:
In the case above @Thomas has a perfectly workable solution. However, sometimes people simplify for the sake of asking the question, so I'll address what you might want to do if you need to to go to many tables (or the first table has no unique constraint on name for instance), not just two.
First, I will insert the data to a staging table and add a column for the id which will be null. Then I would write an insert to the parent table using the OUTPUT clause to output the ids and the natural key to a table variable. Then I would use the table variable to update the id field in the staging table. Then I would insert records from the staging table to the other tables. Since I now have the id there, it is no longer necessary to access the orginal parent table. (If the number of records are large, I might also index the staging table).
Now if you have no natural key, the process becomes harder because you have no way to identify which record goes to who. Then I usually add an identity to the staging table and then do the intial import to the parent table one record at a time (including the stagingtableid as a variable in the cursor) and then update the staging table with each Parent table id as it is created. Once all the intial records are updated, I use set-based processes to insert or update to the other tables.
The staging table also gives you the chance to fix any bad data locally before trying to put it into your production tables.
Other syntax you may need to know if things are complex or if this is a repeated process is the MERGE statment. This will insert if it is a new record and update if it is an existing record.
If this is a very complex transformation, you may consider using SSIS.
Starting with the Customer table
INSERT INTO Customer (Name)
SELECT DISTINCT Name
FROM Flat_CSV_File
If you have repeated imports
INSERT INTO Customer (Name)
SELECT DISTINCT f.Name
FROM Flat_CSV_File f
LEFT OUTER JOIN Customer c ON f.Name = c.Name
WHERE c.Id IS NULL
Orders (your table name Order is a reserved word in TSQL, so you need to quote it with square brackets)
INSERT INTO [Order] (CustomerId, Description, Cost)
SELECT c.Id, f.Description, f.Cost
FROM Flat_CSV_File f
INNER JOIN Customer c ON f.Name = c.Name
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With