I realize this question is very likely to have been asked before, but I've searched around a little among questions on StackOverflow, and I didn't really find an answer to mine, so here goes. If you find a duplicate, please link to it.
For some reason I prefer to use Guid
s (uniqueidentifier
in MsSql) for my primary key fields, but I really don't know why this would be better. In many of tutorials I've walked myself through lately an automatically incremented int
has been used. I can see pro's and cons with both:
Guid
is always of the same size and length, and there is no reason to worry about running out of them, whereas there is a limit to how many records you could have before you'd run out of numbers that fit in an int
.int
is (at least in C#) a nullable type, which opens for a couple of shortcuts when querying for data.int
is easier to read.So, as simple as the title says it: What is the recommended data type for ID (primary key) columns in a database?
EDIT: After recieving a couple of short answer, I must also add this follow-up question. Without it, your answer is neither compelling nor educating... ;) Why do you think so, and what are the cons of the other option that make you not choose that instead?
Any integer type of sufficient size to store anticipated data ranges. Generally 32 bit ints are viewed as too small (rightly or wrongly) for tables with a lot of rows or changes. A 64 bit int is plenty. Many databases won't have or won't use that integer type but will use a NUMBER type with specified scale and precision. 10-15 digits is a fairly common size.
The reason for choosing integer types is twofold:
The size of an integer is:
Compare that to a GUID, which is 128 bits or a normal string, which is at least one byte per character (more in certain character encodings) plus an overhead that might be as little as one byte (terminating null) or could be much more in some cases.
Sorting integers is trivial and, assuming they are unique and the range is sufficiently small, can actually be done in O(n) time, compared to, at best, O(n log n).
also, just as importantly, most databases can generate unique IDs by means of auto-increment columns and/or sequences. Guaranteeing uniqueness in an application is otherwise actually quite hard and tends to result in bloated keys.
Plus auto-generated integer keys are typically either loosely or absolutely ordered (depending on database and configuration), which is a useful quality. Randomly generated GUIDs are basically unordered, which is far less useful.
A big disadvantage of using GUID keys is that it is difficult to perform "ad-hoc" queries by hand. Sometimes it is very useful that you can do this:
SELECT * FROM User where UserID=452245
With GUID keys this can become very annoying.
I would recommend 64 bit integers
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With