Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database Design Primay Key, ID vs String

I am currently planning to develop a music streaming application. And i am wondering what would be better as a primary key in my tables on the server. An ID int or a Unique String.

Methods 1:

Songs Table: SongID(int), Title(string), *Artist**(string), Length(int), *Album**(string)

Genre Table Genre(string), Name(string)

SongGenre: ***SongID****(int), ***Genre****(string)

Method 2

Songs Table: SongID(int), Title(string), *ArtistID**(int), Length(int), *AlbumID**(int)

Genre Table GenreID(int), Name(string)

SongGenre: ***SongID****(int), ***GenreID****(int)

Key: Bold = Primary Key, *Field** = Foreign Key

I'm currently designing using method 2 as I believe it will speed up lookup performance and use less space as an int takes a lot less space then a string.

Is there any reason this isn't a good idea? Is there anything I should be aware of?

like image 428
Michal Ciechan Avatar asked Mar 31 '10 09:03

Michal Ciechan


People also ask

Should database IDS be string or int?

You are doing the right thing - identity field should be numeric and not string based, both for space saving and for performance reasons (matching keys on strings is slower than matching on integers).

Is it good to use string as primary key?

1 Answer. Yes, from a performance standpoint (i.e. inserting or querying or updating) using Strings for primary keys are slower than integers. But if it makes sense to use string for the primary key then you should probably use it.

What is the best data type for primary key columns and why?

Integer (number) data types are the best choice for primary key, followed by fixed-length character data types. SQL Server processes number data type values faster than character data type values because it converts characters to ASCII equivalent values before processing, which is an extra step.


2 Answers

You are doing the right thing - identity field should be numeric and not string based, both for space saving and for performance reasons (matching keys on strings is slower than matching on integers).

like image 120
Oded Avatar answered Oct 05 '22 06:10

Oded


Is there any reason this isn't a good idea? Is there anything I should be aware of?

Yes. Integer IDs are very bad if you need to uniquely identify the same data outside of a single database. For example, if you have to copy the same data into another database system with potentially pre-existing data or you have a distributed database. The biggest thing to be aware of is that an integer like 7481 has no meaning outside of that one database. If later on you need to grow that database, it may be impossible without surgically removing your data.

The other thing to keep in mind is that integer IDs aren't as flexible so they can't easily be used for exceptional cases. The designers of the Internet Protocol understood this and took precautions by allocating certain blocks of numbers as "special" in one way or another (broadcast IPs, private IPs, network IPs). But that was only possible because there's a protocol surrounding the usage of those numbers. Many databases don't operate within such a well-defined protocol.

FWIW, it's kind of like trying to decide if having a "strongly typed" programming paradigm is better than a "weakly/dynamically typed" programming paradigm. It will depend on what you need to do.

like image 37
Dave Avatar answered Oct 05 '22 05:10

Dave