From what I understand, an advantage of a clustered index is that it's faster to seek something there if it's part of the indexed columns as the rows are stored 'close'. Since SQL Server automatically makes primary keys as 'clustered index', is it not true that using a clustered index for something that has no meaning like a surrogate key removes this advantage, and instead I should make the natural key the clustered index?
There is no right or wrong answer here, as it depends. Advantages of clustering, say, an IDENTITY column that is a primary key, as opposed to clustering on something else, include:
INT is 4 bytes (and less if you use data compression). So when you have child tables that reference this table, it is quite useful for the primary key to be skinny so that the repeated information in the child tables is also skinny. Having it clustered doesn't have a ton of bearing on this, but it seems like a hand-wavy solution to use a natural key for the clustered index but then perform joins to child tables using the non-clustered surrogate key. In most solutions I've seen, the surrogate is clustered (and distributed throughout the model), and the natural key is simply made unique.Whether you have a better candidate for the clustering key, who knows? We don't know your application or even what type of data you're thinking about for the natural key. What is the data model? What is the natural key? Is it likely to change? Is it large? Is it really unique?
As an aside, I like to think that SQL Server creates PRIMARY KEY as CLUSTERED by default because (a) most tables should have a clustered index and (b) in most cases the key should be clustered (but not all!).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With