Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to merge two identical database data to one?

Two customers are going to merge. They are both using my application, with their own database. About a few weeks they are merging (they become one organisation). So they want to have all the data in 1 database.

So the two database structures are identical. The problem is with the data. For example, I have Table Locations and persons (these are just two tables of 50):

Database 1:

Locations:

Id    Name         Adress   etc....
1     Location 1
2     Location 2

Persons:

Id    LocationId     Name     etc...
1     1              Alex
2     1              Peter
3     2              Lisa

Database 2:

Locations:

Id    Name         Adress   etc....
1     Location A
2     Location B

Persons:

Id    LocationId     Name     etc...
1     1              Mark
2     2              Ashley
3     1              Ben

We see that person is related to location (column locationId). Note that I have more tables that is referring to the location table and persons table.

The databases contains their own locations and persons, but the Id's can be the same. In case, when I want to import everything to DB2 then the locations of DB1 should be inserted to DB2 with the ids 3 and 4. The the persons from DB1 should have new Id 4,5,6 and the locations in the person table also has to be changed to the ids 4,5,6.

My solution for this problem is to write a query which handle everything, but I don't know where to begin.

What is the best way (in a query) to renumber the Id fields also having a cascade to the childs? The databases does not containing referential integrity and foreign keys (foreign keys are NOT defined in the database). Creating FKeys and Cascading is not an option.

I'm using sql server 2005.

like image 761
Amedo Avatar asked Nov 04 '22 04:11

Amedo


1 Answers

You say that both customers are using your application, so I assume that it's some kind of "shrink-wrap" software that is used by more customers than just these two, correct?

If yes, adding special columns to the tables or anything like this probably will cause pain in the future, because you either would have to maintain a special version for these two customers that can deal with the additional columns. Or you would have to introduce these columns to your main codebase, which means that all your other customers would get them as well.

I can think of an easier way to do this without changing any of your tables or adding any columns.
In order for this to work, you need to find out the largest ID that exists in both databases together (no matter in which table or in which database it is).

This may require some copy & paste to get a lot of queries that look like this:

select max(id) as maxlocationid from locations
select max(id) as maxpersonid from persons
-- and so on... (one query for each table)

When you find the largest ID after running the query in both databases, take a number that's larger than that ID, and add it to all IDs in all tables in the second database.
It's very important that the number needs to be larger than the largest ID that already exists in both databases!

It's a bit difficult to explain, so here's an example:

Let's say that the largest ID in any table in both databases is 8000.
Then you run some SQL that adds 10000 to every ID in every table in the second database:

update Locations set Id = Id + 10000
update Persons set Id = Id + 10000, LocationId = LocationId + 10000
-- and so on, for each table

The queries are relatively simple, but this is the most work because you have to build a query like this manually for each table in the database, with the correct names of all the ID columns.

After running the query on the second database, the example data from your question will look like this:

Database 1: (exactly like before)

Locations:

Id    Name         Adress   etc....
1     Location 1
2     Location 2

Persons:

Id    LocationId     Name     etc...
1     1              Alex
2     1              Peter
3     2              Lisa

Database 2:

Locations:

Id    Name         Adress   etc....
10001 Location A
10002 Location B

Persons:

Id    LocationId     Name     etc...
10001 10001          Mark
10002 10002          Ashley
10003 10001          Ben

And that's it! Now you can import the data from one database into the other, without getting any primary key violations at all.

like image 117
Christian Specht Avatar answered Nov 08 '22 15:11

Christian Specht