I am currently planning to develop a music streaming application. And i am wondering what would be better as a primary key in my tables on the server. An ID int or a Unique String.
Methods 1:
Songs Table: SongID(int), Title(string), *Artist**(string), Length(int), *Album**(string)
Genre Table Genre(string), Name(string)
SongGenre: ***SongID****(int), ***Genre****(string)
Method 2
Songs Table: SongID(int), Title(string), *ArtistID**(int), Length(int), *AlbumID**(int)
Genre Table GenreID(int), Name(string)
SongGenre: ***SongID****(int), ***GenreID****(int)
Key: Bold = Primary Key, *Field** = Foreign Key
I'm currently designing using method 2 as I believe it will speed up lookup performance and use less space as an int takes a lot less space then a string.
Is there any reason this isn't a good idea? Is there anything I should be aware of?
You are doing the right thing - identity field should be numeric and not string based, both for space saving and for performance reasons (matching keys on strings is slower than matching on integers).
1 Answer. Yes, from a performance standpoint (i.e. inserting or querying or updating) using Strings for primary keys are slower than integers. But if it makes sense to use string for the primary key then you should probably use it.
Integer (number) data types are the best choice for primary key, followed by fixed-length character data types. SQL Server processes number data type values faster than character data type values because it converts characters to ASCII equivalent values before processing, which is an extra step.
You are doing the right thing - identity field should be numeric and not string based, both for space saving and for performance reasons (matching keys on strings is slower than matching on integers).
Is there any reason this isn't a good idea? Is there anything I should be aware of?
Yes. Integer IDs are very bad if you need to uniquely identify the same data outside of a single database. For example, if you have to copy the same data into another database system with potentially pre-existing data or you have a distributed database. The biggest thing to be aware of is that an integer like 7481
has no meaning outside of that one database. If later on you need to grow that database, it may be impossible without surgically removing your data.
The other thing to keep in mind is that integer IDs aren't as flexible so they can't easily be used for exceptional cases. The designers of the Internet Protocol understood this and took precautions by allocating certain blocks of numbers as "special" in one way or another (broadcast IPs, private IPs, network IPs). But that was only possible because there's a protocol surrounding the usage of those numbers. Many databases don't operate within such a well-defined protocol.
FWIW, it's kind of like trying to decide if having a "strongly typed" programming paradigm is better than a "weakly/dynamically typed" programming paradigm. It will depend on what you need to do.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With