 

MongoDB custom and unique IDs

Tags:

mongodb

I'm using MongoDB, and I would like to generate unique, cryptic IDs for blog posts (to be used in RESTful URLs), such as s52ruf6wst or xR2ru286zjI.

What do you think is the best and most scalable way to generate these IDs?

I was thinking of the following architecture:

  • a periodic (daily?) batch job that generates a large number of random, unique IDs and inserts them into a dedicated MongoDB collection with InsertIfNotPresent
  • each time I create a new blog post, I take an ID from this collection and mark it as "taken" with an atomic UpdateIfCurrent operation

What do you think?
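The two steps above can be sketched in memory as follows. This is only an illustration under stated assumptions: the `IdPool` class, its `refill` and `claim` methods, and the 10-character base-62 alphabet are hypothetical names chosen for the example, and a lock plus two Python sets stand in for the MongoDB collection and its atomic operations.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_letters + string.digits  # 62 URL-safe symbols


def random_id(length: int = 10) -> str:
    """Return a random ID drawn from a cryptographically strong source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class IdPool:
    """Hypothetical in-memory stand-in for the collection of pre-generated IDs."""

    def __init__(self) -> None:
        self._free = set()    # generated, not yet handed out
        self._taken = set()   # already assigned to a post
        self._lock = threading.Lock()  # stands in for the atomic update

    def refill(self, count: int, length: int = 10) -> None:
        """Batch step: add `count` new IDs, skipping any that already exist."""
        with self._lock:
            added = 0
            while added < count:
                candidate = random_id(length)
                if candidate not in self._free and candidate not in self._taken:
                    self._free.add(candidate)
                    added += 1

    def claim(self) -> str:
        """Per-post step: take one free ID and mark it as taken."""
        with self._lock:
            if not self._free:
                raise LookupError("pool is empty; run the batch again")
            chosen = self._free.pop()
            self._taken.add(chosen)
            return chosen


pool = IdPool()
pool.refill(100)
post_id = pool.claim()  # a 10-character string, never handed out twice
```

In a real deployment the uniqueness check and the "mark as taken" step would have to be enforced by the database itself rather than by a process-local lock.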

Chris asked Jan 10 '11 18:01





2 Answers

This is exactly why the developers of MongoDB constructed their ObjectIDs (the _id field) the way they did: to scale across nodes.

A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter. Note that the timestamp and counter fields must be stored big endian unlike the rest of BSON. This is because they are compared byte-by-byte and we want to ensure a mostly increasing order. Here's the schema:

bytes:  0-3    4-6      7-8   9-11
field:  time   machine  pid   inc

Traditional databases often use monotonically increasing sequence numbers for primary keys. In MongoDB, the preferred approach is to use Object IDs instead. Object IDs are more synergistic with sharding and distribution.

http://www.mongodb.org/display/DOCS/Object+IDs

So I'd say just use ObjectIDs.

They are not that bad when converted to a string. For example, these two were inserted one after the other:

4d128b6ea794fc13a8000001
4d128e88a794fc13a8000002

At first glance they look "guessable", but they really aren't that easy to guess:

4d128 b6e a794fc13a8000001
4d128 e88 a794fc13a8000002
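To see which parts of the two example IDs actually differ, they can be split along the 12-byte layout quoted above. The sketch below assumes that legacy layout (4-byte time, 3-byte machine, 2-byte pid, 3-byte counter); newer MongoDB drivers, as far as I know, replace the machine and pid fields with a 5-byte random value. The `split_object_id` helper and the `ObjectIdFields` type are hypothetical names for this example, not part of any MongoDB driver.

```python
from typing import NamedTuple


class ObjectIdFields(NamedTuple):
    timestamp: int  # seconds since the Unix epoch
    machine: int    # 3-byte machine id
    pid: int        # 2-byte process id
    counter: int    # 3-byte incrementing counter


def split_object_id(hex_id: str) -> ObjectIdFields:
    """Split a 24-character hex ObjectID using the legacy field layout."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectID is exactly 12 bytes (24 hex characters)")
    # The quoted text only specifies big-endian for time and counter;
    # reading machine and pid as big-endian too is an assumption made
    # here so that the values match the hex string as printed.
    return ObjectIdFields(
        timestamp=int.from_bytes(raw[0:4], "big"),
        machine=int.from_bytes(raw[4:7], "big"),
        pid=int.from_bytes(raw[7:9], "big"),
        counter=int.from_bytes(raw[9:12], "big"),
    )


first = split_object_id("4d128b6ea794fc13a8000001")
second = split_object_id("4d128e88a794fc13a8000002")

# Machine and pid are identical; only the timestamp and counter change.
print(second.timestamp - first.timestamp)  # seconds between the two inserts
print(second.counter - first.counter)      # counter step
```

Someone who wanted to guess a neighbouring ID would need the insert time to the second as well as the counter value, in addition to the machine and process fields.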

And for a blog, I don't think it's that big of a deal. We use them in production all over the place.

Justin Jenkins answered Nov 08 '22 07:11



What about using UUIDs?

See http://www.famkruithof.net/uuid/uuidgen for an example generator.
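As a rough sketch of how a UUID could be turned into a shorter URL-friendly ID, Python's standard `uuid` and `base64` modules are enough. The `short_uuid` and `expand` helpers are hypothetical names for this example; the choice of a random (version 4) UUID and of URL-safe Base64 is an assumption, not something the answer specifies.

```python
import base64
import uuid


def short_uuid() -> str:
    """Return a random (version 4) UUID as a 22-character URL-safe string."""
    raw = uuid.uuid4().bytes  # 16 random-based bytes
    # 16 bytes encode to 24 Base64 characters, the last two being '=' padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def expand(short: str) -> uuid.UUID:
    """Recover the original UUID from its shortened form."""
    padded = short + "=" * (-len(short) % 4)  # restore the stripped padding
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


slug = short_uuid()
print(slug)          # 22 characters from A-Z, a-z, 0-9, '-' and '_'
print(expand(slug))  # the same UUID in its usual 36-character form
```

The result is about twice as long as the 10 or 11 character IDs in the question, which is the price of not needing any coordination between the machines that generate them.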

nilfalse answered Nov 08 '22 06:11
