
How to generate unique integer-only IDs like Facebook and Twitter

After searching SO and other sites, I've failed to find conclusive evidence of how Facebook, Twitter and Pinterest generate their IDs. The reason this is needed is to avoid URL collisions. Moving to an entirely different ID will prevent this because there won't be quadrillions of records.

  • Facebook.com/username/posts/362095193814294
  • Pinterest.com/pin/62487513549577588
  • Twitter.com/#!/username/status/17994686627061761

If you look at Pinterest as an example, the first few digits relate to the user id, and the last 6 or so digits represent the save id, which could possibly be an auto-increment.

To create a similar ID, though not a unique one, I was able to use: base_convert(user_id.save_id, 16, 10). The problem is that the result is not unique, e.g. base_convert(15.211, 16, 10) vs. base_convert(152.11, 16, 10). Both inputs concatenate to 15211, so the two results are the same. Simply merging two unique sets of numbers will still produce duplicates. Throwing uniqid() into the mix would essentially fix the duplicates, but that doesn't seem like great practice.
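
The collision can be reproduced in a few lines. This is only a minimal sketch: the literal IDs are the ones from the example, written as strings and joined with PHP's `.` operator, and the variable names are made up for illustration.

```php
<?php
// Concatenating the decimal strings loses the boundary between the two IDs.
$a = "15" . "211";   // user_id 15,  save_id 211
$b = "152" . "11";   // user_id 152, save_id 11

// Both strings are "15211", so any conversion of them is identical too.
var_dump($a === $b);
var_dump(base_convert($a, 16, 10) === base_convert($b, 16, 10));
```

Both checks print bool(true), which is why some kind of delimiter or fixed-width field is needed.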

Update: Twitter appears to use this: https://github.com/twitter/snowflake

Any suggestions on generating a unique ID like the above examples?

stwhite asked Feb 15 '12 21:02



2 Answers

Suppose your IDs are all numeric. Delimit them with the character A (which cannot appear in the original IDs, since it is the digit for 10 in base 11) and do a base conversion from base 11 to base 10.

For your example, we now get different results:

echo base_convert("15A211", 11, 10); //247820
echo base_convert("152A11", 11, 10); //238140
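
One way to check that the delimiter really keeps the two parts apart is to convert the number back and split on it. The helper names `pack_ids` and `unpack_ids` below are hypothetical, not part of the answer. Also keep in mind that the PHP manual warns base_convert() may lose precision on large numbers, so this sketch is probably only safe for fairly small IDs.

```php
<?php
// Hypothetical helpers; the names are illustrative only.
function pack_ids(int $userId, int $saveId): string
{
    // "a" is the digit for 10 in base 11, so it cannot occur in a decimal ID.
    return base_convert($userId . "a" . $saveId, 11, 10);
}

function unpack_ids(string $packed): array
{
    // Converting back to base 11 restores the delimiter (in lowercase).
    [$userId, $saveId] = explode("a", base_convert($packed, 10, 11));
    return [(int) $userId, (int) $saveId];
}

echo pack_ids(15, 211), "\n";   // 247820
echo pack_ids(152, 11), "\n";   // 238140
print_r(unpack_ids("247820"));  // user id 15, save id 211
```
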
user1212517 answered Sep 27 '22 21:09


The Flickr comment up above was very useful. We use sharding as well. We have a bigint (int64) locator field. It is generated by combining an int (int32) database id and an int (int32) identity field.

If you know you will never need more databases than an int16 can number (quite likely), you could combine an int16 (smallint) database id, an int32 (int) user id and an int16 (smallint) action id. I don't know what reasonable numbers are for your application. But reserve some part for the database id, even if it's just a tinyint, so you know you're future-proof if you add more databases.
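
A rough sketch of that kind of packing in PHP, using bit shifts. The function names and the exact field widths are my assumptions, not our production code: it uses 15 bits for the database id (rather than a full 16) so the sign bit stays clear, treats the user id and action id as unsigned, and assumes a 64-bit PHP build (PHP_INT_SIZE === 8).

```php
<?php
// Illustrative layout (an assumption, adjust to your own numbers):
// 15 bits database id | 32 bits user id | 16 bits action id = 63 bits,
// which leaves the sign bit clear so the locator is never negative.
function make_locator(int $dbId, int $userId, int $actionId): int
{
    if ($dbId < 0 || $dbId > 0x7FFF
        || $userId < 0 || $userId > 0xFFFFFFFF
        || $actionId < 0 || $actionId > 0xFFFF) {
        throw new InvalidArgumentException("field out of range");
    }
    return ($dbId << 48) | ($userId << 16) | $actionId;
}

function split_locator(int $locator): array
{
    return [
        ($locator >> 48) & 0x7FFF,      // database id
        ($locator >> 16) & 0xFFFFFFFF,  // user id
        $locator & 0xFFFF,              // action id
    ];
}

$id = make_locator(3, 15, 211);
echo $id, "\n";
print_r(split_locator($id));  // database 3, user 15, action 211
```

Because every field has a fixed width, two different (database, user, action) triples can never produce the same locator, and the database id can be read straight back out of the number when routing a request to a shard.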

Brian White answered Sep 27 '22 20:09