
Composite primary keys and influence on natural/surrogate keys usage [closed]

I have a fairly simple question about natural/surrogate key usage in a well-defined context that comes up often, and which I'm going to illustrate.

Let's assume you are designing the DB schema for a product using SQL Server 2005 as the DBMS. For the sake of simplicity, let's say there are only two entities involved, which have been mapped to two tables, Master and Slave.

Assume that:

  1. We can have 0..n Slave entries for a single Master's row;
  2. Column set (A, B, C, D) in Master is the only candidate for primary key;
  3. Column B in Master is subject to changes over time;
  4. A, B, C, D are a mix of varchar, decimal and bigint columns.

The question is: how would you design keys/constraints/references for those tables? Would you rather (giving reasons for your choice):

  1. Implement a composite natural key on Master on (A, B, C, D), and a related composite foreign key on Slave, or
  2. Introduce a surrogate key K on Master, let's say an IDENTITY(1,1) column, with a related (single-column) foreign key on Slave, adding a UNIQUE constraint on Master's (A, B, C, D), or
  3. Use a different approach.

As for me, I'd go with option 2, mainly because of assumption 3 and for performance reasons, but I'd like to hear someone else's opinion (since there is quite an open debate on the topic).

Andrea Pigazzini asked May 24 '11 10:05



2 Answers

I'd go for option 2. Keep it simple.

It ticks the boxes (narrow, numeric, unchanging, strictly monotonically increasing) for a useful clustered index (which is the default for PKs in SQL Server).

You need to force the uniqueness on A,B,C,D, though, to preserve data integrity, as noted.
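For illustration, here is a minimal sketch of option 2. It is only an assumption-laden stand-in: it uses SQLite through Python's `sqlite3` module rather than SQL Server 2005, `INTEGER PRIMARY KEY AUTOINCREMENT` plays the role of `IDENTITY(1,1)`, and the column types, the `Id`, `MasterK` and `Payload` names and the sample values are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server 2005; the column types and the
# Id/MasterK/Payload names are hypothetical, chosen only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Master (
    K INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, like IDENTITY(1,1)
    A TEXT    NOT NULL,
    B TEXT    NOT NULL,
    C NUMERIC NOT NULL,
    D INTEGER NOT NULL,
    UNIQUE (A, B, C, D)                   -- still enforce the natural key
);
CREATE TABLE Slave (
    Id      INTEGER PRIMARY KEY,
    MasterK INTEGER NOT NULL REFERENCES Master (K),  -- single-column FK
    Payload TEXT
);
""")

conn.execute("INSERT INTO Master (A, B, C, D) VALUES ('a1', 'b1', 1.5, 10)")
k = conn.execute("SELECT K FROM Master").fetchone()[0]
conn.execute("INSERT INTO Slave (MasterK, Payload) VALUES (?, 'child')", (k,))

# A second row with the same (A, B, C, D) violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO Master (A, B, C, D) VALUES ('a1', 'b1', 1.5, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Changing B touches Master only; the Slave row keeps pointing at the same K.
conn.execute("UPDATE Master SET B = 'b2' WHERE K = ?", (k,))
master_b = conn.execute("SELECT B FROM Master WHERE K = ?", (k,)).fetchone()[0]
slave_fk = conn.execute("SELECT MasterK FROM Slave").fetchone()[0]
```

The point of the sketch is that the surrogate key carries the relationship while the UNIQUE constraint carries the business rule, so a change to B never has to reach Slave.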

There is nothing conceptually wrong with option 1, but as soon as you require more indexes on "master", the wide clustered key becomes a liability, or you have more work to do to determine which index is best as the clustered one.

Edit: in case of any confusion, the choice of which index is clustered is separate from the choice of key.

gbn answered Sep 19 '22 23:09


Your assumption (3) tends to suggest option (2), because it is inconvenient and potentially time-consuming to deal with cascading updates of the primary key of Master when B changes.

Of course it depends on how often this will occur: if it is something that you expect to happen "all the time" then it suggests (A,B,C,D) is a poor choice of primary key; on the other hand, if it will only rarely happen, then (A,B,C,D) may be a good choice of primary key, and having those columns in Slave may have some advantages (no need to join to Master all the time to find out those column values).
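To make the trade-off concrete, here is a rough sketch of option (1) with a composite foreign key and `ON UPDATE CASCADE`. It assumes SQLite via Python's `sqlite3` module as a stand-in for SQL Server 2005; the column types, the `Id` column and the sample values are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server 2005; types and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Master (
    A TEXT    NOT NULL,
    B TEXT    NOT NULL,
    C NUMERIC NOT NULL,
    D INTEGER NOT NULL,
    PRIMARY KEY (A, B, C, D)              -- composite natural key
);
CREATE TABLE Slave (
    Id INTEGER PRIMARY KEY,
    A TEXT    NOT NULL,
    B TEXT    NOT NULL,
    C NUMERIC NOT NULL,
    D INTEGER NOT NULL,
    FOREIGN KEY (A, B, C, D)
        REFERENCES Master (A, B, C, D)
        ON UPDATE CASCADE                 -- key changes propagate to Slave
);
""")

conn.execute("INSERT INTO Master VALUES ('a1', 'b1', 1.5, 10)")
conn.executemany(
    "INSERT INTO Slave (A, B, C, D) VALUES (?, ?, ?, ?)",
    [("a1", "b1", 1.5, 10)] * 3,
)

# One logical change to B in Master rewrites every referencing Slave row.
conn.execute(
    "UPDATE Master SET B = 'b2' WHERE A = 'a1' AND B = 'b1' AND C = 1.5 AND D = 10"
)

# The upside: the key columns can be read from Slave without joining to Master.
slave_b_values = [row[0] for row in conn.execute("SELECT B FROM Slave ORDER BY Id")]
```

With three Slave rows the cascade is trivial; the cost that matters is the same rewrite across however many rows reference a Master row each time B changes.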

Tony Andrews answered Sep 21 '22 23:09