Deciding between an artificial primary key and a natural key for a Products table

Tags: database, primary-key

Basically, I need to combine product data from multiple vendors into a single database (it's more complex than that, of course) with several tables that will need to be joined together for most OLTP operations.

I was going to stick with the default and use an auto-incrementing integer as the primary key. However, while one vendor supplies its own "ProductiD" field, the rest do not, so I would have to do a lot of manual mapping to load the data into the other tables: first load each product into the Products table, then pull the generated ID back out and add it, along with the other information I need, to the other tables.

Alternatively, I could use the product's SKU as its primary key, since the SKU is unique for a single product and all of the vendors supply a SKU in their data feeds. If I use the SKU as the PK, I could easily load the data feeds, as everything is based on the SKU, which is how it works in the real world. However, the SKU is alphanumeric and will probably be slightly less efficient than an integer-based key.

Any ideas on which I should look at?


asked Feb 26 '09 13:02

Wayne Molina

2 Answers

This is a choice between surrogate and natural primary keys.

IMHO always favour surrogate primary keys. Primary keys shouldn't have meaning because that meaning can change. Even country names can change, and countries can come into existence and disappear, let alone products. Changing primary keys is definitely not advised, and natural keys can force you to do exactly that.

cletus

In all but the simplest internal situations, I recommend always going for the surrogate key. It gives you options in the future, and protects you from unknowns.

There's no reason why additional keys, like an SKU, couldn't be made non-null and unique so that they are still enforced. But by removing your reliance on third parties, you're at least giving yourself the option to choose, rather than having it taken from you and enduring a painful rewrite at a later stage.
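To make that concrete, here is a minimal sketch of a surrogate key sitting alongside an enforced SKU. It uses Python's built-in sqlite3 module purely so it can be run anywhere; the table and column names (products, prices, product_id, sku) are made up for illustration and are not from the question. The point is that a feed row which only knows the SKU can still be loaded into a child table, because the SKU lookup is done in the INSERT itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,   -- surrogate key, assigned by the database
        sku        TEXT NOT NULL UNIQUE,  -- natural key, still enforced
        name       TEXT NOT NULL
    );
    CREATE TABLE prices (
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        vendor     TEXT NOT NULL,
        price      REAL NOT NULL
    );
""")

conn.execute("INSERT INTO products (sku, name) VALUES (?, ?)", ("ABC-123", "Widget"))

# A vendor feed row only knows the SKU; the SELECT maps it to the surrogate key.
conn.execute(
    """
    INSERT INTO prices (product_id, vendor, price)
    SELECT product_id, ?, ? FROM products WHERE sku = ?
    """,
    ("VendorA", 9.99, "ABC-123"),
)

# The UNIQUE constraint still rejects a second product with the same SKU.
try:
    conn.execute("INSERT INTO products (sku, name) VALUES (?, ?)", ("ABC-123", "Other"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT p.sku, pr.vendor, pr.price FROM prices pr JOIN products p USING (product_id)"
).fetchall()
```

With this layout the feeds can stay SKU-based on the way in, while every join between your own tables uses the integer key.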

Whether you go for the auto-incremented integer or determine the next primary key yourself, there will be complications. With the auto-incremented method, you can insert the record easily and let it assign its own key, but you may have trouble identifying exactly what key your record was given (and getting the max key isn't guaranteed to return yours).
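One way around the "which key did I get?" problem is to ask the connection that did the insert, rather than querying for the maximum. The sketch below shows the idea with Python's sqlite3 module, where the cursor exposes the assigned key as lastrowid; SQL Server has SCOPE_IDENTITY() for the same purpose. The table and the insert_product helper are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "product_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "sku TEXT NOT NULL UNIQUE)"
)

def insert_product(conn, sku):
    """Insert one product and return the key the database assigned to it."""
    cur = conn.execute("INSERT INTO products (sku) VALUES (?)", (sku,))
    # lastrowid belongs to this cursor's own insert, unlike SELECT MAX(product_id),
    # which could pick up a row added by another session in the meantime.
    return cur.lastrowid

first_id = insert_product(conn, "ABC-123")
second_id = insert_product(conn, "XYZ-789")
```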

I tend to go for the self-assigned key because you have more control and, in SQL Server, you can retrieve your key from a central keys table and ensure nobody else gets the same key, all in one statement:

DECLARE @Key INT;

-- Increment the stored counter and capture the new value in a single statement.
UPDATE  KeyTable WITH (ROWLOCK)
SET     @Key = LastKey = LastKey + 1
WHERE   KeyType = 'Product';

The table records the last key used. The SQL above increments that key directly in the table and returns the new key, ensuring its uniqueness.
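For anyone not on SQL Server, the same key-table idea can be sketched with Python's sqlite3 module. This is an approximation under stated assumptions, not a translation: SQLite has no single-statement form like the one above, so the increment and the read are wrapped in one explicit transaction instead. The names key_table, key_type, last_key and next_key are made up for the example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are the only ones in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE key_table (key_type TEXT PRIMARY KEY, last_key INTEGER NOT NULL)"
)
conn.execute("INSERT INTO key_table VALUES ('Product', 0)")

def next_key(conn, key_type):
    """Reserve and return the next key for key_type."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before touching the counter
    try:
        cur = conn.execute(
            "UPDATE key_table SET last_key = last_key + 1 WHERE key_type = ?",
            (key_type,),
        )
        if cur.rowcount != 1:
            raise KeyError(key_type)  # no counter row for this key type
        (key,) = conn.execute(
            "SELECT last_key FROM key_table WHERE key_type = ?", (key_type,)
        ).fetchone()
        conn.execute("COMMIT")
        return key
    except Exception:
        conn.execute("ROLLBACK")
        raise

keys = [next_key(conn, "Product") for _ in range(3)]
```

Because the write lock is held from the increment until the commit, two callers cannot read the same value of last_key.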

Why you should avoid alphanumeric primary keys:

Three main problems: performance, collation and space.

Performance - there is a performance cost. Like Razzie below, I can't quote any numbers, but it is less efficient to index alphanumerics than numbers.

Collation - your developers may create the same key with different collations in different tables (it happens), which leads to constantly adding COLLATE clauses when joining these tables in queries, and that gets old really quickly.

Space - a nine-character SKU like David's takes at least nine bytes, but an integer takes only four (two for smallint, one for tinyint). Even a bigint takes only eight bytes.
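The collation point is the easiest of the three to demonstrate. The sketch below uses Python's sqlite3 module with two hypothetical tables whose sku columns were declared with different collations. SQLite does not raise an error here; it quietly uses the collation of the left-hand column, so the same join returns different results depending on operand order. SQL Server instead reports a collation conflict until a COLLATE clause is added, but the underlying problem is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT COLLATE NOCASE);               -- case-insensitive
    CREATE TABLE stock    (sku TEXT COLLATE BINARY, qty INTEGER);  -- case-sensitive
    INSERT INTO products VALUES ('abc-123');
    INSERT INTO stock    VALUES ('ABC-123', 5);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]

# Same join with the operands swapped: the left-hand column's collation wins.
left_nocase = count("SELECT COUNT(*) FROM products p JOIN stock s ON p.sku = s.sku")
left_binary = count("SELECT COUNT(*) FROM products p JOIN stock s ON s.sku = p.sku")

# An explicit COLLATE removes the ambiguity, at the cost of writing it in every query.
explicit = count(
    "SELECT COUNT(*) FROM products p JOIN stock s ON s.sku = p.sku COLLATE NOCASE"
)
```

Here left_nocase is 1, left_binary is 0 and explicit is 1. With an integer surrogate key as the join column, none of this comes up.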

answered Sep 19 '22 04:09

GenericMeatUnit
