
Picking the best primary key + numbering system

We are trying to come up with a numbering system for the asset system that we are creating. There have been a few heated discussions on this topic in the office, so I decided to ask the experts of SO.

Considering the database design below, what would be the better option?

Example 1: Using auto surrogate keys.

=================   ==================
Road_Number(PK)     Segment_Number(PK)
=================   ==================
 1                   1

Example 2: Using program generated PK

=================   ==================
Road_Number(PK)     Segment_Number(PK)
=================   ==================
 "RD00000001WCK"     "00000001.1"

(The 00000001.1 means it's the first segment of the road. This increases every time you add a new segment, e.g. 00000001.2.)

Example 3: Using a bit of both (adding a new column)

=======================    ==========================
ID(PK) Road_Number(UK)     ID(PK)  Segment_Number(UK)
=======================    ==========================
 1     "RD00000001WCK"       1       "00000001.1"

Just a bit of background information, we will be using the Road Number and Segment Number in reports and other documents, so they have to be unique.

I have always liked keeping things simple so I prefer example 1, but I have been reading that you should not expose your primary keys in reports/documents. So now I'm thinking more along the lines of example 3.

I am also leaning towards example 3 because if we decide to change how our asset numbers are generated, we won't have to do cascading updates on a primary key.
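To make the example 3 reasoning concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (road, segment, road_id) and the new number format "WCK-0001" are assumptions made up for illustration; they are not part of the actual design. It shows that when the business number lives in a unique column rather than in the primary key, renumbering a road touches only one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical schema for example 3: surrogate PK plus a unique business number.
conn.executescript("""
    CREATE TABLE road (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        road_number TEXT NOT NULL UNIQUE                -- number shown in reports
    );
    CREATE TABLE segment (
        id             INTEGER PRIMARY KEY AUTOINCREMENT,
        road_id        INTEGER NOT NULL REFERENCES road(id),
        segment_number TEXT NOT NULL UNIQUE
    );
""")

road_id = conn.execute(
    "INSERT INTO road (road_number) VALUES (?)", ("RD00000001WCK",)
).lastrowid
conn.execute(
    "INSERT INTO segment (road_id, segment_number) VALUES (?, ?)",
    (road_id, "00000001.1"),
)

# The numbering scheme changes (assumed new format): only the road row is updated.
conn.execute("UPDATE road SET road_number = ? WHERE id = ?", ("WCK-0001", road_id))

# The segment still points at the same road through the unchanged surrogate key.
row = conn.execute(
    "SELECT r.road_number, s.segment_number "
    "FROM segment s JOIN road r ON r.id = s.road_id"
).fetchone()
print(row)  # ('WCK-0001', '00000001.1')
```

The segment row never changes because it refers to the road by id, not by the displayed number.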

What do you think we should do?

Thanks.

EDIT: Thanks everyone for the great answers; they have helped me a lot.

Nathan W asked Apr 01 '09 22:04



4 Answers

This is really a discussion about surrogate (also called technical or synthetic) vs natural primary keys, a subject that has been extensively covered. I covered this in Database Development Mistakes Made by App Developers.

Natural keys are keys based on externally meaningful data that is (ostensibly) unique. Common examples are product codes, two-letter state codes (US), social security numbers and so on. Surrogate or technical primary keys are those that have absolutely no meaning outside the system. They are invented purely for identifying the entity and are typically auto-incrementing fields (SQL Server, MySQL, others) or sequences (most notably Oracle).

In my opinion you should always use surrogate keys. This issue has come up in these questions:

  • How do you like your primary keys?
  • What’s the best practice for Primary Keys in tables?
  • Which format of primary key would you use in this situation.
  • Surrogate Vs. Natural/Business Keys
  • Should I have a dedicated primary key field?

Auto number fields are the way to go. If your keys have meaning outside your database (like asset numbers), they will quite possibly change, and changing keys is problematic. Just index those columns in the relevant tables.

cletus answered Oct 20 '22 18:10


I would personally say keep it simple and stay with an autoincremented primary key. If you need something more "Readable" in terms of display in the program, then possibly one of your other ideas, but I think that is just adding unneeded complexity to the primary key field.

TheTXI answered Oct 20 '22 16:10


I'm also very strongly in the "don't use primary keys as meaningful data" camp. Every time I have contravened that policy it has ended in tears. Sooner or later the meaningful data needs to change and if that means you have to change a primary key it can get painful. The primary key will probably be used in foreign key constraints and you can spend ages trying to sort it all out just to make a simple data change.
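A small sketch of that pain, again with Python's sqlite3 and a made-up two-table schema (the table and column names are assumptions, not the asker's design). Here the road number itself is the primary key and is referenced by a foreign key, so a plain rename of the road is rejected while a segment still points at the old value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical schema where the meaningful number is the primary key.
conn.executescript("""
    CREATE TABLE road (
        road_number TEXT PRIMARY KEY
    );
    CREATE TABLE segment (
        segment_number TEXT PRIMARY KEY,
        road_number    TEXT NOT NULL REFERENCES road(road_number)
    );
    INSERT INTO road VALUES ('RD00000001WCK');
    INSERT INTO segment VALUES ('00000001.1', 'RD00000001WCK');
""")

try:
    # A "simple data change": give the road a different number.
    conn.execute(
        "UPDATE road SET road_number = 'RD00000002WCK' "
        "WHERE road_number = 'RD00000001WCK'"
    )
    update_blocked = False
except sqlite3.IntegrityError:
    # The segment row still references the old key, so the update is refused.
    update_blocked = True

print(update_blocked)  # True
```

Declaring the foreign key with ON UPDATE CASCADE would let the rename through, but then every referencing row in every child table is rewritten as part of that one change.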

I always use GUIDs/UUIDs for my primary keys in every table I ever create, but that's just personal preference; serials or the like are also good.
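For what the GUID/UUID variant might look like, here is a minimal sketch with Python's uuid and sqlite3 modules. The road table and its columns are assumptions for illustration, and storing the UUID as text is just one possible choice.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical table: a UUID stored as text is the key, the road number stays a unique column.
conn.execute(
    "CREATE TABLE road (id TEXT PRIMARY KEY, road_number TEXT NOT NULL UNIQUE)"
)

road_id = str(uuid.uuid4())  # random (version 4) UUID generated by the application
conn.execute(
    "INSERT INTO road (id, road_number) VALUES (?, ?)", (road_id, "RD00000001WCK")
)

stored_id = conn.execute("SELECT id FROM road").fetchone()[0]
print(stored_id == road_id)  # True
```

As with an auto-incremented integer, the key carries no meaning, so the road number can change without touching it.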

sipsorcery answered Oct 20 '22 17:10


Don't put meaning into your PK fields unless...

  • It is 100% impossible that the value will ever change, and

  • No two people would ever reasonably argue about which value should be used for a particular row.

Go with option one and format the value in the app to look like option two or three when it is displayed.
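A rough sketch of that formatting step in Python. The function names are hypothetical, and the layout (an "RD" prefix, eight zero-padded digits, a "WCK" suffix, and a per-road segment sequence after the dot) is only inferred from the sample values in the question, so treat it as an assumption.

```python
def format_road_number(road_id: int, prefix: str = "RD", suffix: str = "WCK") -> str:
    """Render an integer road key in the display style of example 2 (assumed layout)."""
    return f"{prefix}{road_id:08d}{suffix}"


def format_segment_number(road_id: int, segment_seq: int) -> str:
    """Render a segment as <zero-padded road id>.<sequence within the road>."""
    return f"{road_id:08d}.{segment_seq}"


print(format_road_number(1))        # RD00000001WCK
print(format_segment_number(1, 2))  # 00000001.2
```

Because the display format lives in the application, changing it later means editing these functions rather than rewriting keys in the database.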

JohnFx answered Oct 20 '22 18:10