I'm trying to find out whether my table will become less performant if I change the primary key to BIGINT(20). At the moment I'm using INT(7) and already have about 300,000 entries with large IDs (7 or 8 digits). I've searched a lot, but only found out that it uses more disk space (which is obvious).
All of my IDs have 7 digits right now, but my customer wants to move to 8 digits. I won't be able to change the software easily in the future, so I'm thinking about switching to BIGINT(20) now, just in case. Would it be less performant if I used BIGINT even though I don't need the extra range yet?
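For reference, the change I have in mind would be something along these lines (table and column names are placeholders for my real schema):

ALTER TABLE orders
    MODIFY id BIGINT(20) NOT NULL AUTO_INCREMENT;
-- note: changing the column type rewrites the table, so this can take a while on a large table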
Does anyone with experience doing this have suggestions regarding speed and performance?
In any decent-sized database you will run into problems with INT at some stage in its lifetime. Use BIGINT if you have to, as it will save a lot of hassle further down the line. I have seen companies hit the INT limit after only a year of data; where reseeding was not an option, it caused massive downtime.
The only difference is the range of the type. INT is 32 bits long while BIGINT is 64 bits long, so it can store much larger numbers such as 123456789123456789 (which cannot be stored as an INT).
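A minimal sketch of the difference, assuming strict SQL mode (where out-of-range values raise an error instead of being clipped):

CREATE TABLE range_demo (small_id INT, big_id BIGINT);
INSERT INTO range_demo (big_id)   VALUES (123456789123456789); -- OK: within BIGINT's signed range (max 9223372036854775807)
INSERT INTO range_demo (small_id) VALUES (123456789123456789); -- fails: exceeds INT's signed range (max 2147483647)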
You can use BIGINT as a primary key, but with some penalties. BIGINT takes up more disk space than INT, and using it as a primary key (or in any index) adds size to the index, perhaps as much as doubling it, since each key value grows from 4 to 8 bytes. This can have a performance impact when searching the index and make queries slower.
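If you want to measure the effect on your own table, something along these lines reports the current data and index sizes (the schema and table names are placeholders):

SELECT table_name,
       data_length  AS data_bytes,
       index_length AS index_bytes
FROM   information_schema.TABLES
WHERE  table_schema = 'your_db'
  AND  table_name   = 'your_table';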
From the SQL Server documentation: the int data type is the primary integer data type in SQL Server. The bigint data type is intended for use when integer values might exceed the range supported by int. bigint fits between smallmoney and int in the data type precedence chart.
To answer your question: yes, it will be less performant. Obviously, the bigger the type, the bigger the table and the slower the queries (more I/O, bigger indexes, longer access times, results less likely to fit in the various caches, and so on). So as a rule of thumb: always use the smallest type that fits your need.
That being said, performance doesn't matter here. Why? Because when you reach the point where you overflow an INT, BIGINT is the only solution and you'll have to live with it. Also, at that point (assuming an auto-increment primary key, that means roughly 2 billion rows signed, or 4 billion unsigned), you'll have bigger performance issues, and the overhead of a BIGINT compared to an INT will be the least of your concerns.
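To illustrate what the overflow looks like in practice, a small sketch assuming a signed INT auto-increment key on InnoDB (the table name is a placeholder):

CREATE TABLE overflow_demo (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, val VARCHAR(10));
ALTER TABLE overflow_demo AUTO_INCREMENT = 2147483647;  -- jump straight to INT's signed maximum
INSERT INTO overflow_demo (val) VALUES ('a');           -- gets id 2147483647
INSERT INTO overflow_demo (val) VALUES ('b');           -- fails (typically a duplicate-key error on 2147483647)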
So, consider the following points:
Not wishing to resurrect a zombie, but 'modern' MySQL offers the column type SERIAL, which is an alias for BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE, and that certainly suggests MySQL is (or will be) optimised for using BIGINT as a primary key.
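A quick illustration (table and column names are just examples):

CREATE TABLE items (id SERIAL, name VARCHAR(50));
SHOW CREATE TABLE items;  -- id expands to BIGINT UNSIGNED NOT NULL AUTO_INCREMENT with a UNIQUE KEY on it (display width varies by MySQL version)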
Also, rather than using SERIAL, a VARBINARY(16) primary key allows one (we do this) to use uuid_short() for the primary key (not uuid(), which is very slow to use as a primary key because it's a string). This has the feature of ensuring that every record has a key which is unique across the entire database (indeed, the network).
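UUID_SHORT() returns a 64-bit unsigned integer, so a minimal sketch of the idea can also use a BIGINT UNSIGNED column rather than the VARBINARY(16) described above (table and column names are placeholders):

CREATE TABLE events (
    id      BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    payload VARCHAR(255)
);
INSERT INTO events (id, payload) VALUES (UUID_SHORT(), 'example');  -- the server generates the key value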
But be aware - some coercions will degrade a BIGINT comparison with bad results. If, for instance, you compare a string representation with a BIGINT, MySQL compares the values as floating point, which loses precision for large values, and you may get false positives. So one must compare using BINARY... eg
where id = binary id_str
Personally I would call this an unfixed bug...
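A minimal illustration of the false positive being warned about, using literal values near the top of the BIGINT range:

SELECT 9223372036854775806 = '9223372036854775807';         -- returns 1: both sides are compared as floating point, so the last digits are lost
SELECT BINARY 9223372036854775806 = '9223372036854775807';  -- returns 0: the cast forces a byte-for-byte string comparison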