 

How do I continue the process after inserting non-matching lookup rows into a table?

Inside a data flow, is it possible to do a lookup, insert the non-matching rows, re-lookup and then continue with the full (original) data set?

I can't see how you get/re-use the inserted rows after they've gone to an OLE DB Destination, nor can I see how a Multicast could wait until a lookup-insert branch completes (and then Union All the rows back in).

If it can't be done inside the data flow, would I have to repeat all my filtering logic in a preceding Data Flow Task simply to do the lookup inserts first?

This might be an obvious/already answered question, but I can't find much after searching.

asked Nov 07 '12 by PeterX


1 Answer

This isn't possible inside a single Data Flow. There are various "solutions" around if you google enough, but they overlook the architectural reality that rows travel down a Data Flow in buffers/batches that are processed in parallel.

So imagine you have multiple "new" rows arriving in 2 adjacent buffers. There is no way to ensure that your downstream handling of the "new" rows from buffer 1 has completed before buffer 2 hits the upstream Lookup. The result is that multiple "new" rows get inserted into your lookup target table for the same key.

You need an upstream Data Flow Task that does all the required lookup inserts first. This is also a more efficient solution overall at runtime, as the lookup inserts can use Fast Load and Table Lock, and the downstream Lookup can use Full Cache.
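
To make the pattern concrete, here is a minimal T-SQL sketch of what that upstream step could run (for example, from an Execute SQL Task before the main Data Flow). The table and column names (stg.Customer, dbo.DimCustomer, CustomerCode) are hypothetical placeholders, not from the original question:

    -- Insert any keys from staging that are not yet in the lookup target.
    -- TABLOCK mirrors the Fast Load + Table Lock options on an OLE DB Destination.
    INSERT INTO dbo.DimCustomer WITH (TABLOCK) (CustomerCode, CustomerName)
    SELECT DISTINCT s.CustomerCode, s.CustomerName
    FROM stg.Customer AS s
    WHERE NOT EXISTS (
        SELECT 1
        FROM dbo.DimCustomer AS d
        WHERE d.CustomerCode = s.CustomerCode  -- the lookup key
    );

Because this runs as a single set-based statement before the main Data Flow starts, the downstream Lookup can safely be configured as Full Cache and will always find the key.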

answered Oct 07 '22 by Mike Honey