I have a user table with userid and username columns, and both are unique.
Between userid and username, which would be better to use as a foreign key, and why?
My boss wants to use a string. Is that OK?
Foreign Keys
If a column is assigned a foreign key, each row of that column must contain a value that exists in the 'foreign' column it references (or NULL, if the column is nullable). The referenced (i.e. “foreign”) column must contain only unique values – often it is the primary key of its table.
A simplified rule of thumb is to put the foreign key on the child table (if each parent can have many children).
The table that contains the foreign key is considered the child table, and the table that the foreign key references is the parent table. A foreign key must also have the same number of columns as the number of columns in the referenced constraint, and the data types must match between corresponding columns.
In MySQL, index prefixes on foreign key columns are not supported. Consequently, BLOB and TEXT columns cannot be included in a foreign key because indexes on those columns must always include a prefix length.
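To make the parent/child rule concrete, here is a minimal sketch in MySQL syntax. The Parent and Child tables and their columns are hypothetical names used only for illustration, and the InnoDB engine is assumed because it enforces foreign keys.

```sql
-- Hypothetical parent table: the referenced column must be unique
-- (here it is the primary key).
CREATE TABLE Parent (
    parentId INT NOT NULL,
    PRIMARY KEY (parentId)
) ENGINE=InnoDB;

-- Hypothetical child table: the foreign key lives here, and its
-- data type matches the referenced column.
CREATE TABLE Child (
    childId  INT NOT NULL,
    parentId INT NOT NULL,
    PRIMARY KEY (childId),
    CONSTRAINT FK_Child_Parent
        FOREIGN KEY (parentId) REFERENCES Parent (parentId)
) ENGINE=InnoDB;

INSERT INTO Parent (parentId) VALUES (1);

-- Accepted: parent 1 exists.
INSERT INTO Child (childId, parentId) VALUES (10, 1);

-- Rejected with a foreign key constraint error: there is no parent 99.
INSERT INTO Child (childId, parentId) VALUES (11, 99);
```

Many Child rows can point at the same Parent row, which is why the key sits on the child side.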
Is string or int preferred for foreign keys?
It depends
There are many existing discussions on the trade-offs between Natural and Surrogate Keys - you will need to decide on what works for you, and what the 'standard' is within your organisation.
In the OP's case, there is both a surrogate key (int userId) and a natural key (char or varchar username). Either column can be used as the primary key for the table, and either way, you will still be able to enforce uniqueness of the other key.
Here are some considerations when choosing one way or the other:
The case for using Surrogate Keys (e.g. UserId INT AUTO_INCREMENT)
If you use a surrogate (e.g. UserId INT AUTO_INCREMENT) as the Primary Key, then all tables referencing table MyUsers should use UserId as the Foreign Key.
You can still enforce uniqueness of the username column through an additional unique index, e.g.:
CREATE TABLE `MyUsers` (
    `userId` int NOT NULL AUTO_INCREMENT,
    `username` varchar(100) NOT NULL,
    -- ... other columns
    PRIMARY KEY (`userId`),
    UNIQUE KEY UQ_UserName (`username`)
);
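A referencing table might then look like the sketch below. The UserOrders table and its columns are hypothetical, the InnoDB engine is assumed, and MyUsers is repeated in a cut-down form only so the snippet runs on its own.

```sql
-- Cut-down copy of the users table, so this snippet is self-contained.
CREATE TABLE IF NOT EXISTS MyUsers (
    userId   INT NOT NULL AUTO_INCREMENT,
    username VARCHAR(100) NOT NULL,
    PRIMARY KEY (userId),
    UNIQUE KEY UQ_UserName (username)
) ENGINE=InnoDB;

-- Hypothetical referencing table: it stores only the narrow integer key.
CREATE TABLE UserOrders (
    orderId   INT NOT NULL AUTO_INCREMENT,
    userId    INT NOT NULL,
    orderDate DATE NOT NULL,
    PRIMARY KEY (orderId),
    CONSTRAINT FK_UserOrders_User
        FOREIGN KEY (userId) REFERENCES MyUsers (userId)
) ENGINE=InnoDB;
```

The unique index on username still prevents duplicate names, even though no other table references that column.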
As per @Dagon, using a narrow primary key (like an int) has performance and storage benefits over using a wider (and variable length) value like varchar. This benefit also impacts further tables which reference MyUsers, as the foreign key to userid will be narrower (fewer bytes to fetch).
Another benefit of the surrogate integer key is that the username can be changed easily without affecting tables referencing MyUsers.
If the username were used as a natural key, and other tables were coupled to MyUsers via username, it would be very inconvenient to change a username (since the Foreign Key relationship would otherwise be violated). If usernames need to be updated while other tables use username as the foreign key, a technique like ON UPDATE CASCADE is needed to retain data integrity.
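As a rough illustration of ON UPDATE CASCADE with a natural key, consider the sketch below. The UsersByName and UserPosts tables are hypothetical, and the InnoDB engine is assumed.

```sql
-- Hypothetical users table keyed on the natural key.
CREATE TABLE UsersByName (
    username VARCHAR(100) NOT NULL,
    PRIMARY KEY (username)
) ENGINE=InnoDB;

-- Hypothetical referencing table; a rename in the parent is
-- propagated to these rows by ON UPDATE CASCADE.
CREATE TABLE UserPosts (
    postId   INT NOT NULL AUTO_INCREMENT,
    username VARCHAR(100) NOT NULL,
    title    VARCHAR(200) NOT NULL,
    PRIMARY KEY (postId),
    CONSTRAINT FK_UserPosts_Username
        FOREIGN KEY (username) REFERENCES UsersByName (username)
        ON UPDATE CASCADE
) ENGINE=InnoDB;

INSERT INTO UsersByName (username) VALUES ('alice');
INSERT INTO UserPosts (username, title) VALUES ('alice', 'First post');

-- Rename the user in the parent table only.
UPDATE UsersByName SET username = 'alice_b' WHERE username = 'alice';

-- The referencing row has been rewritten to the new username.
SELECT postId, username, title FROM UserPosts;
```

Without the ON UPDATE CASCADE clause, the UPDATE would be rejected while referencing rows exist. With it, every referencing row is rewritten, which can be costly when many rows point at the renamed user.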
The case for using Natural Keys (i.e. username)
One downside of using Surrogate Keys is that other tables which reference MyUsers via a surrogate key will need to be JOINed back to the MyUsers table if the Username column is required. One of the potential benefits of Natural keys is that if a query requires only the Username column from a table referencing MyUsers, it need not join back to MyUsers to retrieve the user name, which will save some I/O overhead.
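The difference shows up in the queries themselves. In the sketch below, OrdersById and OrdersByName are hypothetical tables that reference MyUsers by surrogate key and by natural key respectively. The InnoDB engine is assumed, and MyUsers is repeated in a cut-down form so the snippet runs on its own.

```sql
CREATE TABLE IF NOT EXISTS MyUsers (
    userId   INT NOT NULL AUTO_INCREMENT,
    username VARCHAR(100) NOT NULL,
    PRIMARY KEY (userId),
    UNIQUE KEY UQ_UserName (username)
) ENGINE=InnoDB;

-- Hypothetical table referencing the surrogate key.
CREATE TABLE OrdersById (
    orderId INT NOT NULL AUTO_INCREMENT,
    userId  INT NOT NULL,
    PRIMARY KEY (orderId),
    FOREIGN KEY (userId) REFERENCES MyUsers (userId)
) ENGINE=InnoDB;

-- Hypothetical table referencing the natural key
-- (allowed because username carries a unique index).
CREATE TABLE OrdersByName (
    orderId  INT NOT NULL AUTO_INCREMENT,
    username VARCHAR(100) NOT NULL,
    PRIMARY KEY (orderId),
    FOREIGN KEY (username) REFERENCES MyUsers (username)
) ENGINE=InnoDB;

-- Surrogate key: the username is only available through a join.
SELECT o.orderId, u.username
FROM OrdersById AS o
JOIN MyUsers AS u ON u.userId = o.userId;

-- Natural key: the username is already stored in the referencing table.
SELECT orderId, username
FROM OrdersByName;
```

Whether the saved join outweighs the wider key depends on the workload, so it is worth measuring on real data before deciding.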