Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should create surrogate key instead of composite key?

Structure:

Actor <=== ActorMovie ===> Movie

ActorMovie: ActorID (fk), MovieId (fk)... ===> pk: (ActorID, MovieID)

Should do I create a surrogate key for ActorMovie table like this?

ActorMovie: ActorMovieID (pk), ActorID (fk), MovieId (fk)...
like image 1000
Johan Gosh Avatar asked Nov 25 '13 10:11

Johan Gosh


4 Answers

Conventions are good if they are helpful

"SQL Antipatterns", Chapter 4, "ID Required"

Intention of primary key

Primary key - is something that you can use to identify your row with it's unique address in table. That means, not only some surrogate column can be primary key. In fact, primary key should be:

  • Unique. identifier for each row. If it's compound, that means, every combination of column's values must be unique
  • Minimal. That means, it can't be reduced (i.e. if it's compound, no column could be omitted without losing uniqueness)
  • Single. No other primary key may be defined, each table can have only one primary key

Compound versus surrogate

There are cases, when surrogate key has benefits. Most common problem - if you have table with people names. Can combination of first_name + last_name + taxpayer_id be unique? In most cases - yes. But in theory, there could be cases, when duplicated will occur. So, this is the case, when surrogate key will provide unique identifying of rows in any case.

However, if we're talking about many-to-many link between tables, it's obvious, that linking table will always contain each pair once. In fact, you'll even need to check if duplicate does not exist before operating with that table (otherwise - it's redundant row, because it holds no additional information unless your design has a special intention to store that). Therefore, your combination of ActorID + MovieID satisfies all conditions for primary key, and there's no need to create surrogate key. You can do that, but that will have little sense (if not at all), because it will have no meaning rather than numbering rows. In other hand, with compound key, you'll have:

  • Unique check by design. Your rows will be unique, no duplicates for linking table will be allowed. And that has sense: because there's no need to create a link if it already exists
  • No redundant (and, thus, less comprehensive) column in design. That makes your design easier and more readable.

As a conclusion - yes, there are cases, when surrogate key should (or even must) be applied, but in your particular case it will definitely be antipattern - use compound key.

References:

  • Primary keys in SQL
  • SQL Antipatterns by Bill Karwin
like image 64
Alma Do Avatar answered Sep 27 '22 17:09

Alma Do


I'd always go with the composite key. My reasoning:

  • You will probably never use the surrogate key anywhere.
  • You will reduce the number of indexes/constraints on the table, as you will most certainly still need a indexes over actor and movie.
  • You will always search for either movie or an actor anyway.

Unless you have a scenario where you will actually use the surrogate key outside of it's own table, I'd go with the composite key.

like image 28
0xCAFEBABE Avatar answered Sep 27 '22 19:09

0xCAFEBABE


Let me just mention a detail that seems to have been missed by other posters: InnoDB tables are clustered.

If you have just a primary key, your whole table will be represented by a lone B-Tree, which is very efficient. Adding a surrogate would just create another B-Tree (and "fatter" than expected to boot, due to how clustering works), without benefit to offset the added overhead.

Surrogates have their place, but junction tables are usually not it.

like image 29
Branko Dimitrijevic Avatar answered Sep 27 '22 18:09

Branko Dimitrijevic


If you want to associate other data elements with the join table, such as the name(s) of the role(s) played (which might be a child table) then I certainly would. If you were sure that you never wanted to then I'd consider it as optional.

like image 44
David Aldridge Avatar answered Sep 27 '22 18:09

David Aldridge