I have a basic DB schema comprising two tables: one is a simple ID -> text list of terms, and the other has two columns, parent and child. The IDs in the first table are generated on insert by a DB sequence, while the second table maps between keys to store the 'structure' of the hierarchy.
My problem is that I may sometimes want to move a tree from one DB to another. If I have two DBs, each with 10 terms (Database A's terms != Database B's terms, and there's no overlap), and I just copy the data from A to B, then I'll get an obvious problem: the terms will be renumbered but the relationships won't. Clearly in this example just adding 10 to all the relationship keys will work, but does anyone know of a general algorithm to do this?
The DB is Oracle 11g, and an Oracle-specific solution is fine...
Quick answer
Import into a staging table, but populate the mapped ID values from the same sequence that produces ID values for the destination table. This is guaranteed to avoid conflicts between ID values, because the DBMS engine supports concurrent access to sequences.
With the ID values on the nodes mapped (see below), re-mapping the ID values for the edges is trivial.
Longer answer
You will need a mechanism that maps the old keys from the source to the new keys in the destination. The way to do this is to create intermediate staging tables that hold the mappings between the old and new keys.
In Oracle, autoincrementing keys are usually done with sequences in much the way you've described. You need to construct staging tables with a placeholder for the 'old' key so you can do the re-mapping. Use the same sequence as used by the application to populate the ID values on actual destination database tables. The DBMS allows concurrent accesses to sequences and using the same sequence guarantees that you will not get collisions in the mapped ID values.
If you have a schema like:
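create table STAGE_NODE (
    ID int          -- new key, drawn from the destination's sequence
   ,STAGED_ID int   -- original key from the source database
)
/
create table STAGE_EDGE (
    FROM_ID int       -- re-mapped key, filled in after the nodes are staged
   ,TO_ID int         -- re-mapped key, filled in after the nodes are staged
   ,OLD_FROM_ID int   -- original key from the source database
   ,OLD_TO_ID int     -- original key from the source database
)
/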
This will allow you to import into the STAGE_NODE table, preserving the imported key values. The insert process puts the original ID from the imported table into STAGED_ID and populates ID from the sequence.
Make sure you use the same sequence that's used for populating the ID column in the destination table. This ensures that you won't get key collisions when you go to insert into the final destination table.
As a useful side effect, this also allows the import to run while other operations are taking place on the table; concurrent reads on a single sequence are fine. If necessary, you can run this type of import process without bringing down the application.
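As a rough sketch of that load step: assume the source rows have already been copied into holding tables SRC_NODE (ID, TERM_TEXT) and SRC_EDGE (PARENT, CHILD), and that the destination's sequence is called NODE_SEQ. All of those names are made up for illustration, as is the choice to treat parent as "from" and child as "to"; substitute whatever your schema actually uses.

```sql
-- Hypothetical names: SRC_NODE / SRC_EDGE hold the rows copied from the
-- source database; NODE_SEQ is the sequence the destination table uses.
--
-- Give every imported node a fresh ID from the destination's sequence,
-- keeping the original key alongside it in STAGED_ID.
insert into STAGE_NODE (ID, STAGED_ID)
select NODE_SEQ.nextval, src.ID
from SRC_NODE src
/
-- Load the edges with only their old keys (assumed here: parent = from,
-- child = to). FROM_ID and TO_ID stay null until the re-mapping step.
insert into STAGE_EDGE (OLD_FROM_ID, OLD_TO_ID)
select src.PARENT, src.CHILD
from SRC_EDGE src
/
```

Because the new IDs come from the sequence rather than from "old ID + some offset", this works no matter how many rows either database already holds.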
Once you have this mapping in the staging table, ID values in the EDGE table are trivial to compute with a query like:
select node1.ID as FROM_ID
      ,node2.ID as TO_ID
  from STAGE_EDGE se
  join STAGE_NODE node1
    on node1.STAGED_ID = se.OLD_FROM_ID
  join STAGE_NODE node2
    on node2.STAGED_ID = se.OLD_TO_ID
The mapped EDGE values can be populated back into the staging tables using an UPDATE query with a similar join or inserted directly into the destination table from a query similar to the one above.
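Something along these lines should do for the UPDATE route and the final copy. It is only a sketch under a few assumptions: STAGED_ID is unique in STAGE_NODE (true if the source ID was a primary key), the source terms are sitting in a holding table SRC_NODE (ID, TERM_TEXT), and the destination tables are TERM (ID, TERM_TEXT) and TERM_EDGE (PARENT, CHILD). Those table and column names are hypothetical.

```sql
-- Write the new keys back onto the staged edges. Each scalar subquery
-- returns one row only if STAGED_ID is unique in STAGE_NODE.
update STAGE_EDGE se
set FROM_ID = (select n.ID from STAGE_NODE n where n.STAGED_ID = se.OLD_FROM_ID)
   ,TO_ID   = (select n.ID from STAGE_NODE n where n.STAGED_ID = se.OLD_TO_ID)
/
-- Sanity check: any row returned here refers to a node that was never staged.
select OLD_FROM_ID, OLD_TO_ID
from STAGE_EDGE
where FROM_ID is null or TO_ID is null
/
-- Copy nodes first, then edges, into the (hypothetical) destination tables.
insert into TERM (ID, TERM_TEXT)
select sn.ID, src.TERM_TEXT
from STAGE_NODE sn
join SRC_NODE src
  on src.ID = sn.STAGED_ID
/
insert into TERM_EDGE (PARENT, CHILD)
select se.FROM_ID, se.TO_ID
from STAGE_EDGE se
/
```

If the sanity check returns rows, fix the staging data before running the two final inserts, and commit only once both have gone through.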