On my current project, I came across our master DB script. Taking a closer look at it, I noticed that all of our original primary keys have a data type of numeric(38,0) We are currently running SQL Server 2005 as our primary DB platform.
For a little context, we support both Oracle and SQL Server as our back-end. In Oracle, our primary keys have a data type of number(38,0).
Does anybody know of possible side-effects and performance impact of such implementation? I have always advocated and implemented int or bigint as primary keys and would love to know if numeric(38,0) is a better alternative.
Integer (number) data types are the best choice for primary key, followed by fixed-length character data types. SQL Server processes number data type values faster than character data type values because it converts characters to ASCII equivalent values before processing, which is an extra step.
The primary key should be numeric or date (avoid the use of text data types). Because it takes longer for the database to compare string values than numeric values, using text columns as primary keys is likely to degrade query performance. The primary key should be compact (avoid the use of long data types).
Primary Key Example Other poor choices for primary keys include ZIP code, email address, and employer, all of which can change or represent many people. The identifier used as a primary key must be unique.
Primary keys must contain unique values. A primary key column cannot have NULL values. A table can have only one primary key, which may consist of single or multiple fields.
Well, you are spending more data to store numbers that you will never really reach.
bigint goes up to 9,223,372,036,854,775,807 in 8 Bytes
int goes up to 2,147,483,647 in 4 bytes
A NUMERIC(38,0) is going to take, if I am doing the math right, 17 bytes.
Not a huge difference, but: smaller datatypes = more rows in memory (or fewer pages for the same # of rows) = fewer disk I/O to do lookups (either indexed or data page seeks). Goes the same for replication, log pages, etc.
For SQL Server: INT is an IEEE standard and so is easier for the CPU to compare, so you get a slight performance increase by using INT vs. NUMERIC (which is a packed decimal format). (Note in Oracle, if the current version matches the older versions I grew up on, ALL datatypes are packed so an INT inside is pretty much the same thing as a NUMERIC( x,0 ) so there's no performance difference)
So, in the grand scheme of things -- if you have lots of disk, RAM, and spare I/O, use whatever datatype you want. If you want to get a little more performance, be a little more conservative.
Otherwise at this point, I'd leave it as it is. No need to change things.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With