I am a bit of an SSIS newbie and while the whole system seems straightforward, I don't conceptually understand the process I need to go through in this scenario:
Need to map Invoice and InvoiceLine tables from a source database to two equivalent tables in a destination database - with different identity values.
For each invoice inserted across, I need to get the identity it was assigned and then insert all its lines referencing that new identity
There is a surrogate key on the invoices (the invoice number), however these might also clash with invoice numbers in the target system, hence they would also have to be renumbered.
This must be a common scenario in integration - is there a common solution?
Chris KL - you are correct that this is harder than one would expect. I have three methods for this, which work in different situations:
IF the data you are loading is small (hundreds or thousands but not hundreds OF thousands) then you can do this: use an OLEDB command that performs one insert for each parent row and returns the identity value back; then downstream from that join the output from that to the child rows, and insert them. Advantage: intuitive. Disadvantage: scales badly. This method is documented on the web and should Google for you.
If we are talking about a bigger system where you need bulk loading, then there are two other flavors:
a. If you have exclusive access to the table during the load (really exclusive, enforced in some way) then you can grab the max existing ID from the table, use an SSIS script task to number the rows starting above that max id, then Set Identity Insert On, stuff them in, and Set Identity Insert Off. You then have those script-generated keys in SSIS to assign to the child rows. Advantage: fast and simple, one trip to the DB. Disadvantage: possible errors if some other process inserts into your table at the same time. Brittle.
b. If you don't have exclusive access, then the only way I know of is with a round trip to the DB, thus: Insert all parent rows but keep track of a key for them that is not the identity column (a business key, for example). In a second dataflow, process the child records by using a Lookup transform that uses the business key to fetch the parent ID. Make sure the lookup is tuned appropriately vs. caching, and that thee business key is indexed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With