Say you have a Many-Many table between Artists and Fans. When it comes to designing the table, do you design the table like such:
ArtistFans
ArtistFanID (PK)
ArtistID (FK)
UserID (FK)
(ArtistID and UserID will then be contrained with a Unique Constraint
to prevent duplicate data)
Or do you build use a compound PK for the two relevant fields:
ArtistFans
ArtistID (PK)
UserID (PK)
(The need for the separate unique constraint is removed because of the
compound PK)
Are there are any advantages (maybe indexing?) for using the former schema?
ArtistFans
ArtistID (PK)
UserID (PK)
The use of an auto incremental PK has no advantages here, even if the parent tables have them.
I'd also create a "reverse PK" index automatically on (UserID, ArtistID)
too: you will need it because you'll query the table by both columns.
Autonumber/ID columns have their place. You'd choose them to improve certain things after the normalisation process based on the physical platform. But not for link tables: if your braindead ORM insists, then change ORMs...
Edit, Oct 2012
It's important to note that you'd still need unique (UserID, ArtistID)
and (ArtistID, UserID)
indexes. Adding an auto increments just uses more space (in memory, not just on disk) that shouldn't be used
Assuming that you're already a devotee of the surrogate key (you're in good company), there's a case to be made for going all the way.
A key point that is sometimes forgotten is that relationships themselves can have properties. Often it's not enough to state that two things are related; you might have to describe the nature of that relationship. In other words, there's nothing special about a relationship table that says it can only have two columns.
If there's nothing special about these tables, why not treat it like every other table and use a surrogate key? If you do end up having to add properties to the table, you'll thank your lucky presentation layers that you don't have to pass around a compound key just to modify those properties.
I wouldn't even call this a rule of thumb, more of a something-to-consider. In my experience, some slim majority of relationships end up carrying around additional data, essentially becoming entities in themselves, worthy of a surrogate key.
The rub is that adding these keys after the fact can be a pain. Whether the cost of the additional column and index is worth the value of preempting this headache, that really depends on the project.
As for me, once bitten, twice shy – I go for the surrogate key out of the gate.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With