
SQL Server insert performance with and without primary key

Tags:

sql-server

Summary: I have a table populated via the following:

insert into the_table (...) select ... from some_other_table

Running the above query with no primary key on the_table is ~15x faster than running it with a primary key, and I don't understand why.

The details: I think this is best explained through code examples.

I have a table:

create table the_table (
    a int not null,
    b smallint not null,
    c tinyint not null
);

If I add a primary key, this insert query is terribly slow:

alter table the_table
    add constraint PK_the_table primary key(a, b);

-- Inserting ~880,000 rows
insert into the_table (a,b,c)
    select a,b,c from some_view;

Without the primary key, the same insert query is about 15x faster. However, after populating the_table without a primary key, I can add the primary key constraint and that only takes a few seconds. This one really makes no sense to me.
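For reference, the faster sequence described above is sketched below. It reuses the same statements from the question in a different order; nothing here is new apart from the comments.

```sql
-- Start from the_table as a heap (no primary key), as created above.
-- Inserting ~880,000 rows
insert into the_table (a,b,c)
    select a,b,c from some_view;

-- Add the key only after the rows are loaded, so the index is built
-- from the full set of rows in one operation instead of row by row.
alter table the_table
    add constraint PK_the_table primary key(a, b);
```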

More info:

  • The estimated execution plan shows 0% total query time spent on the clustered index insert
  • SQL Server 2008 R2 Developer edition, 10.50.1600

Any ideas?

Eric, asked Apr 01 '11 04:04


2 Answers

Actually, it's not as clear-cut as Ryk suggests.

It can actually be faster to add data to a table with a clustered index than to a heap.

Read this article, which as far as I am aware is quite well regarded:

http://www.sqlskills.com/blogs/kimberly/post/The-Clustered-Index-Debate-Continues.aspx

Bear in mind that it's written by a SQL Server MVP and Microsoft Regional Director.

From the article:

"Inserts are faster in a clustered table (but only in the "right" clustered table) than compared to a heap. The primary problem here is that lookups in the IAM/PFS to determine the insert location in a heap are slower than in a clustered table (where insert location is known, defined by the clustered key). Inserts are faster when inserted into a table where order is defined (CL) and where that order is ever-increasing. I have some simple numbers but I'm thinking about creating a much larger/complex scenario and publishing those. Simple/quick tests on a laptop are not always as "exciting"."
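One way to test the "ever-increasing order" idea against the table from the question is to present the rows in clustered-key order. This is a minimal sketch, not a guaranteed fix: it assumes the_table already has the clustered primary key on (a, b) and that some_view can be sorted on those columns at reasonable cost. The ORDER BY clause and the TABLOCK hint are my additions, not part of the original query, and the optimizer remains free to pick its own plan, so compare actual execution plans and timings.

```sql
-- Sketch only: the_table, some_view, and the key (a, b) come from the question.
-- Assumption: the clustered primary key on (a, b) already exists.

-- Ask for the rows in clustered-key order so that new rows tend to land at
-- the current end of the index rather than in the middle of existing pages.
-- TABLOCK (an assumption, optional) takes a table-level lock for the
-- duration of the statement, which can reduce locking overhead.
insert into the_table with (tablock) (a, b, c)
    select a, b, c
    from some_view
    order by a, b;
```

If other sessions need to read or write the_table during the load, drop the TABLOCK hint and keep only the ORDER BY.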

Daniel Fountain, answered Sep 20 '22 12:09


I think that if you create a simple clustered primary key on a single auto-incrementing column, inserts into such a table might be faster. A primary key made up of multiple columns is the most likely cause of the slowdown. With a composite primary key, newly inserted rows may not be appended to the end of the table; they may have to go somewhere in the middle of the existing physical order of rows, which can force pages to be split and adds to the insert time. So use a single auto-incrementing column as the primary key in your case to speed up inserts.
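A minimal sketch of that suggestion, applied to the table from the question. The id column and the constraint name UQ_the_table_a_b are hypothetical names introduced for illustration; the unique constraint is my assumption about how to keep the uniqueness of (a, b) that the original primary key enforced.

```sql
-- Sketch only: id and UQ_the_table_a_b are hypothetical names.
create table the_table (
    id int identity(1,1) not null,   -- auto-incrementing surrogate key
    a int not null,
    b smallint not null,
    c tinyint not null,
    -- Clustered key on an ever-increasing value: new rows go at the end.
    constraint PK_the_table primary key clustered (id),
    -- Still enforce that (a, b) is unique, as the original key did.
    constraint UQ_the_table_a_b unique nonclustered (a, b)
);

-- The insert itself is unchanged; id is filled in automatically.
insert into the_table (a, b, c)
    select a, b, c from some_view;
```

Note that the unique constraint is itself an index ordered by (a, b), so some of the out-of-order insert cost may simply move there. Measure both designs before committing to one.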

Sunil, answered Sep 21 '22 12:09