Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ID Best Practices for Databases [closed]

I was wondering what the best practices were for building and storing IDs. A few years ago, a professor told me about the dangers of a poorly constructed ID system, using the Social Security Number as an example. In particular, because SSNs do not have any error detection... it is impossible to tell the difference between a 9-digit string and a valid SSN. And now government agencies need things like Last Name + SSN or Birthday + SSN to keep track of your data and ensure its verification. Plus, your Social Security number is somewhat predictable based on where you were born.

Now I'm building a User database... and based off of this advice "userid mediumint auto_increment" would be unacceptable. Especially if I plan to use this ID as the primary identification for the user. (for example, if I allow the users to change their username, then the username would be more difficult to keep track than the numerical userid... requiring cascading foreign keys and whatnot.) Emails change, usernames can change, passwords change... but a userid should remain constant forever.

Clearly, auto_increment is only designed for surrogate_keys. That is, its a useful shortcut only when you already have a primary identification mechanism, but it shouldn't be used as an "innate identifier" for the data. Creating random UUID looks interesting, but the randomness turns me off.

And so I ask: whats the best practices for creating a "primary key" identification number?

like image 837
Dragontamer5788 Avatar asked Dec 03 '10 22:12

Dragontamer5788


People also ask

Should we use UUID as ID?

By using UUIDs, you ensure that your ID is not just unique in the context of a single database table or web application, but is truly unique in the universe. No other ID in existence should be the same as yours.

What is a best practice database?

The Best Practice Database is a research, analysis and decision-support service for professionals across industries and functions such as marketing, medical affairs, sales, operations, HR, R&D and customer service.

When should the database be updated?

The typical release cycle for DBMS software is every 18 to 36 months for major releases, with constant bug fixes and maintenance updates delivered in between those major releases. In a complex, heterogeneous, distributed database environment, a coherent upgrade strategy is essential.


2 Answers

You are confusing internal database functionality with external search criteria.

Auto-increment surrogate keys are useful for internal application use. Never pass those on to the user. Identifying business objects, whether it is a user or an invoice, are done with unique information about the object, like SSN, CCN or DOB. Use as much info as necessary to uniquely identify the object.

I highly recommend that if you must supply some newly invented ID value to each customer, that it NOT be the field you link all the customer data tables on.

like image 176
Bill Avatar answered Oct 11 '22 11:10

Bill


The best practice is to use an auto-increment integer. There's no real reason it shouldn't be used as an "innate identifier". It'll provide the most compact usage in foreign keys and fastest searches. Almost any other value can change and is inappropriate for use as a key.

like image 35
Samuel Neff Avatar answered Oct 11 '22 13:10

Samuel Neff