 

SQL Server Int or BigInt database table Ids

Tags:

sql

sql-server

I am writing a new program and it will require a database (SQL Server 2008). Everything I am running now for the system is 64-bit, which brings me to this question. For all of the Id columns in various tables, should I make them all INT or BIGINT? I doubt the system will ever surpass the INT range but it is a possibility within some of the larger financial tables I suppose. It seems like INT is the standard though...

asked Jan 23 '10 by Rob Packwood

People also ask

Should I use INT or BIGINT?

In any decent sized database you will run into problems with INT at some stage in its lifetime. Use BIGINT if you have to as it will save a lot of hassle further down the line. I have seen companies hit the INT issue after only a year of data and where reseeding was not an option it caused massive downtime.
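As an aside, when an IDENTITY column does hit the INT ceiling, one stopgap sometimes used - where negative keys are acceptable - is to reseed into the unused negative half of the range. A minimal sketch, assuming a hypothetical dbo.Orders table:

    -- Stopgap only: restart the IDENTITY at the bottom of the INT range.
    -- Assumes nothing relies on the IDs being positive or ascending.
    DBCC CHECKIDENT ('dbo.Orders', RESEED, -2147483648);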

What is the difference between BIGINT and INT in SQL?

The int data type is the primary integer data type in SQL Server. The bigint data type is intended for use when integer values might exceed the range that is supported by the int data type. bigint fits between smallmoney and int in the data type precedence chart.

Should I use INT or BIGINT for primary key?

You can use BIGINT as a primary key but with some penalties. BIGINT takes up more space on disk storage than INT and using BIGINT as a primary key (or any index) will add size to the index, perhaps as much as doubling it. This can have a performance impact on searching the index and make it slower to run queries.
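To see that size impact on your own tables, one option is to compare index sizes; this query is illustrative only (dbo.Orders is a placeholder name) and uses sys.dm_db_partition_stats:

    -- Approximate size of each index on a hypothetical dbo.Orders table.
    -- used_page_count is in 8 KB pages, so multiply by 8 for KB.
    SELECT i.name                  AS index_name,
           ps.used_page_count * 8  AS size_kb
    FROM   sys.dm_db_partition_stats AS ps
    JOIN   sys.indexes AS i
           ON  i.object_id = ps.object_id
           AND i.index_id  = ps.index_id
    WHERE  ps.object_id = OBJECT_ID('dbo.Orders');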


2 Answers

OK, let's do a quick math recap:

  • INT is 32-bit and gives you basically 4 billion values - if you only count the values larger than zero, it's still 2 billion. Do you have this many employees? Customers? Products in stock? Orders in the lifetime of your company? REALLY?

  • BIGINT goes way, way, way beyond that. Do you REALLY need that?? REALLY?? If you're an astronomer, or into particle physics - maybe. An average Line of Business user? I strongly doubt it - the exact limits are in the quick check below.
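For reference, here are the upper bounds of each integer type - a quick sanity check you can run yourself, not part of the original answer:

    -- Maximum value and storage size of each SQL Server integer type.
    SELECT CAST(255                  AS TINYINT)  AS tinyint_max,   -- 1 byte,  0 to 255
           CAST(32767                AS SMALLINT) AS smallint_max,  -- 2 bytes, about +/- 32 thousand
           CAST(2147483647           AS INT)      AS int_max,       -- 4 bytes, about +/- 2.1 billion
           CAST(9223372036854775807  AS BIGINT)   AS bigint_max;    -- 8 bytes, about +/- 9.2 quintillion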

Imagine you have a table with - say - 10 million rows (orders for your company). Let's say, you have an Orders table, and that OrderID which you made a BIGINT is referenced by 5 other tables, and used in 5 non-clustered indices on your Orders table - not overdone, I think, right?

10 million rows, by 5 tables plus 5 non-clustered indices, that's 100 million instances where you are using 8 bytes each instead of 4 bytes - 400 million bytes = 400 MB. A total waste... you'll need more data and index pages, your SQL Server will have to read more pages from disk and cache more pages.... that's not beneficial for your performance - plain and simple.
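Spelled out as a query, the back-of-the-envelope arithmetic above (illustrative numbers only):

    -- 10 million orders, the key repeated in 5 referencing tables and 5 non-clustered indexes,
    -- at 4 extra bytes per occurrence when BIGINT is used instead of INT.
    SELECT 10000000 * (5 + 5) * (8 - 4)             AS extra_bytes,  -- 400,000,000
           10000000 * (5 + 5) * (8 - 4) / 1000000.0 AS extra_mb;     -- about 400 MB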

PLUS: what most programmers don't think about: yes, disk space is dirt cheap. But that wasted space is also relevant in your SQL Server RAM and your database cache - and that space is not dirt cheap!

So to make a very long post short: use the smallest integer type that really suits your needs; if you have 10-20 distinct values to handle, use TINYINT. For an orders table, I believe INT should be PLENTY ENOUGH - BIGINT is only a waste of space.
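As a sketch of what "the smallest type that fits" can look like in practice (all table and column names here are made up for illustration):

    -- A small lookup table: 10-20 values, so TINYINT is plenty.
    CREATE TABLE dbo.OrderStatus
    (
        OrderStatusID TINYINT     NOT NULL PRIMARY KEY,
        StatusName    VARCHAR(50) NOT NULL
    );

    -- An orders table: up to ~2.1 billion rows, so INT is plenty.
    CREATE TABLE dbo.Orders
    (
        OrderID       INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        OrderStatusID TINYINT           NOT NULL
            REFERENCES dbo.OrderStatus (OrderStatusID),
        OrderDate     DATETIME          NOT NULL
    );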

Plus: should any of your tables really ever get close to 2 or 4 billion rows, you'll still have plenty of time to upgrade your table to a BIGINT ID, if that's really needed.
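If that day ever comes, the widening itself is a one-liner in principle - though in practice you would first have to drop (and afterwards re-create) the primary key, foreign keys and indexes that use the column, and the rebuild can be slow on a huge table. A simplified sketch:

    -- Widen the hypothetical OrderID column from INT to BIGINT.
    -- Any constraints and indexes on the column must be dropped first and re-created afterwards.
    ALTER TABLE dbo.Orders
        ALTER COLUMN OrderID BIGINT NOT NULL;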

answered Oct 02 '22 by marc_s


You should use the smallest data type that makes sense for the table in question. That includes using smallint or even tinyint if there are few enough rows.

You'll save space on both data and indexes and get better index performance. Using a bigint when all you need is a smallint is similar to using a varchar(4000) when all you need is a varchar(50).

Even if the machine's native word size is 64 bits, that only means that 64-bit CPU operations won't be any slower than 32-bit operations. Most of the time they won't be faster either; they'll be the same. But most databases are not CPU-bound anyway - they're I/O-bound and, to a lesser extent, memory-bound - so a 50-90% smaller data size is a Very Good Thing when you need to perform an index scan over 200 million rows.
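A quick way to see the per-value difference - an illustrative query, not part of the original answer:

    -- Bytes of storage per value for each integer type.
    SELECT DATALENGTH(CAST(1 AS SMALLINT)) AS smallint_bytes,  -- 2
           DATALENGTH(CAST(1 AS INT))      AS int_bytes,       -- 4
           DATALENGTH(CAST(1 AS BIGINT))   AS bigint_bytes;    -- 8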

answered Oct 02 '22 by Aaronaught