Structure:
Actor <=== ActorMovie ===> Movie
ActorMovie: ActorID (fk), MovieId (fk)... ===> pk: (ActorID, MovieID)
Should do I create a surrogate key for ActorMovie
table like this?
ActorMovie: ActorMovieID (pk), ActorID (fk), MovieId (fk)...
Conventions are good if they are helpful
"SQL Antipatterns", Chapter 4, "ID Required"
Intention of primary key
Primary key - is something that you can use to identify your row with it's unique address in table. That means, not only some surrogate column can be primary key. In fact, primary key should be:
Compound versus surrogate
There are cases, when surrogate key has benefits. Most common problem - if you have table with people names. Can combination of first_name
+ last_name
+ taxpayer_id
be unique? In most cases - yes. But in theory, there could be cases, when duplicated will occur. So, this is the case, when surrogate key will provide unique identifying of rows in any case.
However, if we're talking about many-to-many link between tables, it's obvious, that linking table will always contain each pair once. In fact, you'll even need to check if duplicate does not exist before operating with that table (otherwise - it's redundant row, because it holds no additional information unless your design has a special intention to store that). Therefore, your combination of ActorID
+ MovieID
satisfies all conditions for primary key, and there's no need to create surrogate key. You can do that, but that will have little sense (if not at all), because it will have no meaning rather than numbering rows. In other hand, with compound key, you'll have:
As a conclusion - yes, there are cases, when surrogate key should (or even must) be applied, but in your particular case it will definitely be antipattern - use compound key.
References:
I'd always go with the composite key. My reasoning:
Unless you have a scenario where you will actually use the surrogate key outside of it's own table, I'd go with the composite key.
Let me just mention a detail that seems to have been missed by other posters: InnoDB tables are clustered.
If you have just a primary key, your whole table will be represented by a lone B-Tree, which is very efficient. Adding a surrogate would just create another B-Tree (and "fatter" than expected to boot, due to how clustering works), without benefit to offset the added overhead.
Surrogates have their place, but junction tables are usually not it.
If you want to associate other data elements with the join table, such as the name(s) of the role(s) played (which might be a child table) then I certainly would. If you were sure that you never wanted to then I'd consider it as optional.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With