Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Bulk insert with parent/child relationships, is order preserved?

Similar to these other questions noted below, I have two tables with the structure:

create table parent (
   recno int identity(1,1) primary key not null,
   groupCode int,
   parentdata varchar(80)
);

create table child (
   parentrecno int not null,
   childdata varchar(80)
)

I need to insert a few hundred thousand records quickly into these tables -- and the tables hold millions of other records unrelated to this insert and are never quiet. Because of the parent/child nature, it's not a good candidate (it seems) for SqlBulkCopy.

In C# using SqlCommand with INSERT I'm getting about 400-500 records/second inserted, and this is a bit too slow. Pseudocode:

 foreach(Record r in parentRecords)
 {
      Insert Fields from r into SqlCommand Parameters but not "recno"
      Call ExecuteScalar to insert and fetch the inserted identity value (recno)
      foreach(ChildRecord cr in parentRecords.Children)
      {
          Insert Fields from cr into SqlCommand Parameters
          Insert the identity value (recno) from above into Parameters 
                                                       (as parentrecno)
          Call ExecuteNonQuery to insert the record
      }   
 }

After reading those other posts, a though occurred to me. The groupCode attached to the parent records is unique to that set of parent records I'm inserting. Would it work to:

  1. Bulk insert the parent records with SqlBulkCopy, letting the insert auto-generate the recno identity field as usual.
  2. Perform a SELECT on just the inserted records:

    select recno from parent where groupCode = @thisgroup order by recno;
    
  3. Use the retrieved values to fill in the parentrecno fields for the child records in memory

  4. Bulk insert the child records with SqlBulkCopy

This would rely on the parent records going into the SQL table in the same order as they are in the original DataTable (and the identity values being assigned in that same order). Is this something I can rely on?

Related questions:

How to update Dataset Parent & Child tables with Autogenerated Identity Key?

SqlBulkCopy and DataTables with Parent/Child Relation on Identity Column

like image 762
Clinton Pierce Avatar asked Jun 22 '11 15:06

Clinton Pierce


1 Answers

Create two staging tables with the same structure as your target tables but don't use identity on recno column.

create table parentTmp (
   recno int,
   groupCode int,
   parentdata varchar(80)
);

create table childTmp (
   parentrecno int not null,
   childdata varchar(80)
)

Bulk load your data to the staging tables, keeping the recno/parentrecno values as is.

Then you can use merge and output to move the data from the staging tables.

-- Table variable to hold mapping between 
-- SourceRecno and TargetRecno
declare @recno table(SourceRecno int, TargetRecno int);

-- Merge data from parentTmp to parent
-- Output old and new recno to @recno
merge parent T
using parentTmp S
on 0=1
when not matched then
  insert (groupCode, parentdata)
    values (S.groupCode, S.parentData)
output S.recno, inserted.recno into @recno;

-- Copy data from childTmp to child
-- Use @recno to get the new recno
insert into child(parentrecno, childdata)
select R.TargetRecno, C.childdata
from childTmp as C
  inner join @recno as R
    on C.parentrecno = R.SourceRecno;

This will only work in SQL Server 2008 (and later I presume).

like image 98
Mikael Eriksson Avatar answered Oct 10 '22 00:10

Mikael Eriksson