
SQL Server performance difference with single or multi column primary key?

Is there any performance difference (for inserts, updates, and queries) between a table whose primary key is a single column (e.g., a GUID generated for every row) and one whose primary key is multiple columns (e.g., a foreign-key GUID plus an offset number)?
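
For concreteness, here is a minimal sketch of the two designs I mean (table and column names are hypothetical):

    -- Design 1: single-column primary key, a GUID generated per row
    CREATE TABLE OrderLineA (
        OrderLineID uniqueidentifier NOT NULL PRIMARY KEY,
        OrderID     uniqueidentifier NOT NULL,
        Quantity    int              NOT NULL
    );

    -- Design 2: multi-column primary key, foreign-key GUID + offset number
    CREATE TABLE OrderLineB (
        OrderID    uniqueidentifier NOT NULL,  -- FK to the parent table
        LineNumber int              NOT NULL,  -- offset within the parent
        Quantity   int              NOT NULL,
        PRIMARY KEY (OrderID, LineNumber)
    );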

I would assume querying should be quicker, if anything, with a multi-column primary key, but I would imagine inserting would be slower due to a slightly more complicated uniqueness check. I also imagine the data types in a multi-column primary key could matter (e.g., if one of the columns were a DateTime it would add complexity). These are just my thoughts to provoke answers and discussion (hopefully!) and are not fact-based.

I realise there are some other questions covering this topic, but I'm wondering about performance impacts rather than management/business concerns.

asked Oct 25 '10 at 02:10 by mike

2 Answers

You will be affected more by each component of the key being (a) variable-length and (b) wide (wide instead of narrow columns) than by the number of components in the key. Unless MS have broken it again in the latest release (they broke Heaps in 2005). The datatype as such does not slow it down; the width, and particularly variable length (in any datatype), does. Note that a fixed-length column is made variable if it is set to Nullable. Variable-length columns in indices are bad news, because a bit of "unpacking" has to be performed on every access to get at the data.

Obviously, keep indexed columns as narrow as possible, using only fixed-length, non-Nullable columns.
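
A minimal sketch of that advice (the table and column names are hypothetical; the point is the shape of the key columns, not the schema):

    -- Good: narrow, fixed-length, NOT NULL key columns
    CREATE TABLE Stock (
        WarehouseCode char(4) NOT NULL,
        PartCode      char(8) NOT NULL,
        QtyOnHand     int     NOT NULL,
        CONSTRAINT PK_Stock PRIMARY KEY (WarehouseCode, PartCode)
    );

    -- Bad news per the above: indexing a wide, variable-length, Nullable column
    CREATE TABLE Part (
        PartCode    char(8)      NOT NULL PRIMARY KEY,
        Description varchar(200) NULL
    );
    CREATE INDEX IX_Part_Description ON Part (Description);  -- "unpacking" on every access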

In terms of the number of columns in a compound key, sure, one column is faster than seven, but not by that much: three fat, wide, variable columns are much slower than seven thin, fixed columns.

GUID is of course a very fat key; GUID plus anything else is very, very fat; a Nullable GUID is Guinness material. Unfortunately it is the knee-jerk reaction to solving the IDENTITY problem, which in turn is a consequence of not having chosen good natural relational keys. So you are best advised to fix the real problem at the source, and choose good natural keys; avoid IDENTITY; avoid GUID.
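
A hedged sketch of what good natural keys can look like (the tables and codes here are illustrative, not prescriptive):

    -- Natural relational keys: short, stable codes instead of GUID/IDENTITY
    CREATE TABLE Country (
        CountryCode char(2)     NOT NULL PRIMARY KEY,  -- e.g. 'AU', 'US'
        Name        varchar(50) NOT NULL
    );

    CREATE TABLE CountryState (
        CountryCode char(2)     NOT NULL REFERENCES Country (CountryCode),
        StateCode   char(3)     NOT NULL,              -- e.g. 'NSW', 'TX'
        Name        varchar(50) NOT NULL,
        PRIMARY KEY (CountryCode, StateCode)
    );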

Experience and performance tuning, not conjecture.

answered Nov 09 '22 at 05:11 by PerformanceDBA


It depends on your access patterns, your read/write ratio, and (possibly most importantly) whether the clustered index is defined on the Primary Key.

The rule of thumb is to make your primary key as small as possible (a 32-bit int) and to define the clustered index on a monotonically increasing key (think IDENTITY) where possible, unless range searches form a large proportion of the queries against that table.
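
As a minimal illustration of that rule of thumb (hypothetical table):

    -- Small (32-bit), monotonically increasing clustered key
    CREATE TABLE OrderHeader (
        OrderID    int IDENTITY(1,1) NOT NULL,
        CustomerID int               NOT NULL,
        OrderDate  datetime          NOT NULL,
        CONSTRAINT PK_OrderHeader PRIMARY KEY CLUSTERED (OrderID)
    );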

If your application is write-intensive and you define the clustered index on the GUID column, you should note:

  1. All non-clustered indexes will contain the clustered index key and will therefore be larger. This may have a negative effect on performance if there are many NC indexes.

  2. Unless you are using an 'ordered' GUID (such as a COMB, or NEWSEQUENTIALID()), your inserts will fragment the index over time. This means you need regular index rebuilds, and possibly to increase the amount of free space left in pages (the fill factor); see the sketch below.
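
For illustration, here is a sequential GUID default plus the kind of maintenance point 2 implies (names are hypothetical; note NEWSEQUENTIALID() is only valid as a column default):

    -- 'Ordered' GUIDs: NEWSEQUENTIALID() avoids random-insert fragmentation
    CREATE TABLE EventLog (
        EventID uniqueidentifier NOT NULL
            CONSTRAINT DF_EventLog_EventID DEFAULT NEWSEQUENTIALID(),
        Payload varchar(100)     NOT NULL,
        CONSTRAINT PK_EventLog PRIMARY KEY CLUSTERED (EventID)
    );

    -- If you cluster on random GUIDs instead, plan on regular rebuilds
    -- and a lower fill factor to leave free space in pages:
    ALTER INDEX PK_EventLog ON EventLog REBUILD WITH (FILLFACTOR = 80);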

Because there are many factors at work (hardware, access patterns, data size), I suggest you run some tests and benchmark your particular circumstances.
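
One simple way to start benchmarking (shown against the hypothetical tables sketched in the question; requires SQL Server 2008+ for the inline DECLARE initializers):

    -- Compare logical reads and CPU time for the two key designs
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    DECLARE @id  uniqueidentifier = NEWID();
    DECLARE @oid uniqueidentifier = NEWID();

    SELECT Quantity FROM OrderLineA WHERE OrderLineID = @id;
    SELECT Quantity FROM OrderLineB WHERE OrderID = @oid AND LineNumber = 1;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;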

answered Nov 09 '22 at 05:11 by Mitch Wheat