I'm fairly new to database management and this question never seems to be answered in more than one sentence. All other SO answers say "A candidate key is a minimal super key." That means nothing to me.
A candidate key is supposed to specify uniqueness of a db record, correct? And a primary key is a candidate key. If a primary key already specifies uniqueness, what's the point of adding more candidate keys?
I have seen example records like the following:
Employee(ID, Name, PhoneNumber)
where ID is the primary key and PhoneNumber is a candidate key. From what I see, the ID is enough to specify the uniqueness of an employee record. Although PhoneNumbers are (probably) unique, specifying them as a candidate key does not seem "minimal" to me.
A candidate key is a super key from which no attribute can be removed without losing uniqueness: a minimal set of attributes that uniquely identifies each row of the table. A table can have more than one candidate key. Candidate keys determine which attributes are prime (part of some candidate key) and which are non-prime, and enforcing them protects the integrity of the data by preventing duplicate rows.
Both primary keys and candidate keys can be used to retrieve records from a table and to create relationships between tables. Each of them identifies a record uniquely within its table.
A candidate key is a minimal super key, that is, a super key with no redundant attributes. It is called minimal because removing any attribute from it leaves a set that no longer uniquely identifies each row of the table.
Candidate key: a minimal set of attributes that can uniquely identify a tuple, for example STUD_NO in the STUDENT relation. It is a minimal super key, that is, a super key with no redundant attributes.
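To make "minimal super key" concrete, here is a small sketch that finds the candidate keys of the Employee example from the question. It treats a handful of sample rows as if they were every row the table could ever hold, which is an assumption: real candidate keys come from the rules of the schema, not from one snapshot of data. The function names and the sample rows are made up for illustration.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Return the minimal superkeys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set that contains an already-found key is a superkey,
            # but it is not minimal, so it is not a candidate key.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical sample data: two employees share a name.
employees = [
    {"ID": 1, "Name": "Ann", "PhoneNumber": "555-0101"},
    {"ID": 2, "Name": "Bob", "PhoneNumber": "555-0102"},
    {"ID": 3, "Name": "Ann", "PhoneNumber": "555-0103"},
]

keys = candidate_keys(employees, ["ID", "Name", "PhoneNumber"])
print(keys)  # [('ID',), ('PhoneNumber',)]
```

(ID, PhoneNumber) is also unique, so it is a super key, but it is not minimal because either column alone is enough. "Minimal" describes each key on its own; it does not mean the table has only one key.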
It means that if PhoneNumber was indeed a candidate key you could delete the ID column and use PhoneNumber instead. In other words, it is a candidate for being a unique key.
Wikipedia has a more formal definition that you may want to look at.
A key is called a candidate key, because while it could be used as a PK, it is not necessarily the PK.
There can be more than one candidate key for a given table, e.g., EmployeeID and SSN.
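As a rough illustration of why the extra candidate key is still worth declaring, the sketch below uses Python's built-in sqlite3 module with a made-up Employee table. EmployeeID is chosen as the PK and SSN is declared UNIQUE, so the database keeps rejecting duplicate SSNs even though SSN is not the PK. The table layout and the sample values are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,   -- the candidate key chosen as PK
        SSN        TEXT NOT NULL UNIQUE,  -- another candidate key, still enforced
        Name       TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO Employee VALUES (1, '000-00-0001', 'Ann')")

# A different EmployeeID with the same SSN violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO Employee VALUES (2, '000-00-0001', 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True

# Either candidate key can be used to look up exactly one row.
row = conn.execute(
    "SELECT Name FROM Employee WHERE SSN = ?", ("000-00-0001",)
).fetchone()
print(row)  # ('Ann',)
```

Without the UNIQUE declaration the second insert would succeed, because the PK alone says nothing about SSN.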
Often, rather than using a candidate key as the PK, a surrogate key is created instead. One reason is that the choice of candidate key can later turn out to be wrong, which can cause a huge headache (literally).
Another reason is that a surrogate key can be created using an efficient data type for indexing purposes, which the candidate keys may not have (e.g., a UserImage).
A third reason is that many ORMs work only with a single-column PK, so candidate keys composed of more than one column (composite keys) are ruled out in that case.
Something that many developers do not realize is that selecting a surrogate key over a natural key may be a compromise in terms of data integrity. You may be losing some constraints on your data by selecting a surrogate key, and often a trigger is required to simulate the constraint if a surrogate key is chosen.
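The sketch below shows the integrity gap in its simplest form, again with sqlite3 and a made-up Employee table. With only a surrogate id as the key, two rows with the same SSN are accepted. Declaring the natural key UNIQUE next to the surrogate key restores the check in this simple case; rules that a plain UNIQUE constraint cannot express are where a trigger may come in. The helper name and both schemas are assumptions for illustration.

```python
import sqlite3

def accepts_duplicate_ssn(create_sql):
    """Create the table, insert two rows with the same SSN,
    and report whether the second insert was accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    insert = "INSERT INTO Employee (SSN, Name) VALUES (?, ?)"
    conn.execute(insert, ("000-00-0001", "Ann"))
    try:
        conn.execute(insert, ("000-00-0001", "Bob"))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

# Surrogate key only: nothing stops the natural key from repeating.
surrogate_only = """
    CREATE TABLE Employee (
        id   INTEGER PRIMARY KEY,  -- surrogate key, assigned by SQLite
        SSN  TEXT NOT NULL,
        Name TEXT NOT NULL
    )
"""

# Surrogate key plus a UNIQUE constraint on the natural key.
surrogate_plus_unique = """
    CREATE TABLE Employee (
        id   INTEGER PRIMARY KEY,
        SSN  TEXT NOT NULL UNIQUE,
        Name TEXT NOT NULL
    )
"""

print(accepts_duplicate_ssn(surrogate_only))         # True
print(accepts_duplicate_ssn(surrogate_plus_unique))  # False
```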