I'm designing a small SQL database to be used by a web application.
Let's say a particular table has a Name field for which no two rows will be allowed to have the same value. However, users will be able to change the Name field at any time.
The primary key from this table will be used as a foreign key in other tables. So if the Name field was used as the primary key, any changes would need to be propagated to those other tables. On the other hand, the uniqueness requirement would be handled automatically.
My instinct would be to add an integer field to act as the primary key, which could be automatically populated by the database. Is there any point in having this field or would it be a waste of time?
I would use a generated PK myself, just for the reasons you mentioned. Also, indexing and comparing by integer is faster than comparing by strings. You can put a unique index on the name field too without making it a primary key.
What you are describing is called a surrogate key. See the Wikipedia article for the long answer.
Though it's faster to search and join on an integer column (as many have pointed out), it's even faster to never join in the first place. By storing a natural key, you can often eliminate the need for a join.
For a smallish database, the CASCADE updates to the foreign key references wouldn't have much performance impact, unless they were changing extremely often.
That being said, you should probably use an integer or GUID as a surrogate key in this case. An updateable-by-design primary key isn't the best idea, and unless your application has a very compelling business reason to be unique by name - you will inevitably have conflicts.
Yes - and as a rule of thumb, always, for every table.
You should definitely not use a changeable field as a primary key and in the vast majority of circumstances you don't want to use a field that has any other purpose as a primary key.
This is basic good practice for db schemas.
Have an integer primary key is always a good thing from the performance prospective. All of your relationships will be much more efficient with an integer primary key. For example, JOINs will be very much faster (SQL Server).
It will also allow you future modifications of the database. Quite often you have a unique name column only to find out later that the name it is not unique at all.
Right now, you could enforce the uniqueness of the column Name by having an index on it as well.
I would use an auto-generated ID field for the primary key. It's easier to join with tables based off integer IDs than text. Also, if field Name is updated often, if it were a primary key, the database would be put under stress for updating the index on that field much more often.
If field Name is always unique, you should still mark it as unique in the database. However, often there will be a possibility (maybe not currently but possibly in the future in your case) of two same names, so I do not recommend it.
Another advantage for using IDs is in the case you have a reporting need on your database. If you have a report you want for a given set of names, the ID filter on the report would stay consistent even when the names might change.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With