Similar to these other questions noted below, I have two tables with the structure:
create table parent (
recno int identity(1,1) primary key not null,
groupCode int,
parentdata varchar(80)
);
create table child (
parentrecno int not null,
childdata varchar(80)
)
I need to insert a few hundred thousand records quickly into these tables -- and the tables hold millions of other records unrelated to this insert and are never quiet. Because of the parent/child nature, it's not a good candidate (it seems) for SqlBulkCopy
.
In C# using SqlCommand
with INSERT
I'm getting about 400-500 records/second inserted, and this is a bit too slow. Pseudocode:
foreach(Record r in parentRecords)
{
Insert Fields from r into SqlCommand Parameters but not "recno"
Call ExecuteScalar to insert and fetch the inserted identity value (recno)
foreach(ChildRecord cr in parentRecords.Children)
{
Insert Fields from cr into SqlCommand Parameters
Insert the identity value (recno) from above into Parameters
(as parentrecno)
Call ExecuteNonQuery to insert the record
}
}
After reading those other posts, a though occurred to me. The groupCode
attached to the parent records is unique to that set of parent records I'm inserting. Would it work to:
SqlBulkCopy
, letting the insert auto-generate the recno
identity field as usual.Perform a SELECT
on just the inserted records:
select recno from parent where groupCode = @thisgroup order by recno;
Use the retrieved values to fill in the parentrecno
fields for the child records in memory
SqlBulkCopy
This would rely on the parent records going into the SQL table in the same order as they are in the original DataTable (and the identity values being assigned in that same order). Is this something I can rely on?
Related questions:
How to update Dataset Parent & Child tables with Autogenerated Identity Key?
SqlBulkCopy and DataTables with Parent/Child Relation on Identity Column
Create two staging tables with the same structure as your target tables but don't use identity on recno column.
create table parentTmp (
recno int,
groupCode int,
parentdata varchar(80)
);
create table childTmp (
parentrecno int not null,
childdata varchar(80)
)
Bulk load your data to the staging tables, keeping the recno/parentrecno values as is.
Then you can use merge and output to move the data from the staging tables.
-- Table variable to hold mapping between
-- SourceRecno and TargetRecno
declare @recno table(SourceRecno int, TargetRecno int);
-- Merge data from parentTmp to parent
-- Output old and new recno to @recno
merge parent T
using parentTmp S
on 0=1
when not matched then
insert (groupCode, parentdata)
values (S.groupCode, S.parentData)
output S.recno, inserted.recno into @recno;
-- Copy data from childTmp to child
-- Use @recno to get the new recno
insert into child(parentrecno, childdata)
select R.TargetRecno, C.childdata
from childTmp as C
inner join @recno as R
on C.parentrecno = R.SourceRecno;
This will only work in SQL Server 2008 (and later I presume).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With