
Can I use String as ID type for mongodb document?

I am using Java with Morphia to work with MongoDB. The default ObjectId is not very convenient to use from the Java layer. I would like to make the ID a String while still using ObjectId for key generation, say _id = new ObjectId().toString().

Are there any side effects of doing it this way? For example, will it hurt database performance or cause key conflicts in any way? Will it affect a sharded environment ...

Gelin Luo asked Jun 18 '12 20:06


People also ask

What is the datatype of ID in MongoDB?

In a modern database such as MongoDB, we still need a unique identifier to act as a primary key, and it lives in the _id field. MongoDB provides an automatic unique identifier for the _id field in the form of an ObjectId. An ObjectId is generated automatically as the unique document identifier if no other identifier is provided.

Which data format is used for MongoDB documents?

MongoDB stores data in BSON format both internally, and over the network, but that doesn't mean you can't think of MongoDB as a JSON database.

Is ID mandatory in MongoDB?

Yes, an _id field is mandatory. «In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field».

How does MongoDB create ID?

By default, MongoDB generates a unique ObjectID identifier that is assigned to the _id field in a new document before writing that document to the database. In many cases the default unique identifiers assigned by MongoDB will meet application requirements.


4 Answers

You can use any type of value for an _id field (except for arrays). If you choose not to use ObjectId, you'll have to guarantee uniqueness of the values yourself (casting an ObjectId to a string will do). If you try to insert a duplicate key, an error will occur and you'll have to deal with it.

I'm not sure what effect it will have on a sharded cluster when you attempt to insert two documents with the same _id into different shards. I suspect that it will let you insert, but this will bite you later. (I'll have to test this.)

That said, you should have no troubles with _id = (new ObjectId).toString().

Sergio Tulentsev answered Oct 07 '22 18:10


I actually did the same thing because I was having some problems converting the ObjectId to JSON.

I then did something like

    @Id
    private String id;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

Everything worked fine until I decided to update a previously inserted document. I fetched the object by ID, sent it to the page as JSON, received the updated object back through a JSON POST, and then called the save function on the Datastore. Instead of updating the existing document, it inserted a new one.

Even worse, the new document had the same ID as the previously inserted one, something I thought was impossible.

Anyway, I changed the private field to an ObjectId and left the getter and setter as String, and then it worked as expected. Not sure that helps in your case, though.

    @Id
    private ObjectId id;

    public String getId() {
        return id.toString();
    }

    public void setId(String id) {
        this.id = new ObjectId(id);
    }
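Since the setter above turns a String that arrived over JSON back into an ObjectId, it may be worth checking the input before converting it. The following is only a sketch using the standard library; the class and method names (ObjectIdHex, looksLikeObjectId) are made up for illustration and are not part of Morphia or the MongoDB driver. It relies on the fact that an ObjectId is 12 bytes, so its hex form is 24 hex digits.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Morphia or the MongoDB driver.
final class ObjectIdHex {
    // An ObjectId is 12 bytes, so its hex form is exactly 24 hex digits.
    private static final Pattern HEX_24 = Pattern.compile("^[0-9a-fA-F]{24}$");

    private ObjectIdHex() {}

    /** Returns true if s has the shape of an ObjectId hex string. */
    static boolean looksLikeObjectId(String s) {
        return s != null && HEX_24.matcher(s).matches();
    }
}
```

A setter could call ObjectIdHex.looksLikeObjectId(id) first and reject the request if it returns false, rather than letting the conversion fail deeper in the stack.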
Destino answered Oct 07 '22 17:10


Yes, you can use a string as your _id.

I'd recommend it only if you have some value (in the document) that naturally is a good unique key. I used this design in one collection where there was a string geo-tag, of the form "xxxxyyyy"; this unique-per-document field was going to HAVE to be in the document and I had to build an index on it... so why not use it as a key? (This avoided one extra key-value pair, AND avoided a second index on the collection since MongoDB naturally builds an index on "_id". Given the size of the collection, both of these added up to some serious space savings.)

However, from the tone of your question ("ObjectIDs are not very convenient"), if the only reason you want to use a string is you don't want to be bothered with figuring out how to neatly manage ObjectIDs... I'd suggest it is worth your time to get your head around them. I'm sure they are no trouble... once you've figured out your trouble with them.

Otherwise: what are your options? Will you concoct string IDs EVERY TIME you use MongoDB in the future?

Dan H answered Oct 07 '22 19:10


I would like to add that it is not always a good idea to use the automatically generated BSON ObjectID as a unique identifier, if it gets passed to the application: it can potentially be manipulated by the user.

ObjectIDs appear to be generated sequentially, so if you fail to implement the necessary authorization mechanisms, a malicious user could simply increment the value they have to access resources they should not have access to.

UPDATE: As of version 3.4, ObjectIDs are no longer generated incrementally. Compare the 3.2 docs with the latest docs.
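To see why an ObjectID is not an opaque value, note that its first 4 bytes are a timestamp in seconds since the Unix epoch; in current versions the remaining bytes are a random value plus a counter. The sketch below reads that timestamp back out of the hex string using only the standard library. The class name ObjectIdTimestamp and the sample ID are made up for illustration.

```java
import java.time.Instant;

// Hypothetical helper for illustration; the MongoDB driver is not required.
final class ObjectIdTimestamp {
    private ObjectIdTimestamp() {}

    /** Reads the creation time from the first 4 bytes (8 hex digits) of an ObjectId. */
    static Instant creationTime(String objectIdHex) {
        if (objectIdHex == null || objectIdHex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        // The leading 8 hex digits are big-endian seconds since the Unix epoch.
        long seconds = Long.parseLong(objectIdHex.substring(0, 8), 16);
        return Instant.ofEpochSecond(seconds);
    }
}
```

Calling ObjectIdTimestamp.creationTime("4fdf8d5e0364a1b2c3d4e5f6") returns the moment encoded in the made-up sample ID, which shows that anyone holding an ID also learns roughly when the document was created.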

Therefore using UUID type identifiers will provide a layer of security-through-obscurity. Of course, Authorization (is this user allowed to access requested resource) is a must, but you should be aware of the aforementioned ObjectID feature.

To get the best of both worlds, generate a UUID-style random value that matches the ObjectID length (12 bytes, or 24 hex characters) and use it to create your own _id of ObjectID type.
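One way to read this suggestion: a standard UUID is 16 bytes and will not fit, so draw 12 random bytes instead and hex-encode them to 24 characters. This is a sketch under that assumption, using only the standard library; the class name RandomIdHex is made up. With the MongoDB Java driver on the classpath, the resulting string could be passed to the ObjectId constructor that accepts a hex string.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration; not part of the MongoDB driver.
final class RandomIdHex {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private RandomIdHex() {}

    /** Returns 12 random bytes encoded as 24 lowercase hex characters. */
    static String next() {
        byte[] bytes = new byte[12];
        RANDOM.nextBytes(bytes);
        char[] out = new char[24];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xff;       // treat the byte as unsigned
            out[2 * i] = HEX[v >>> 4];     // high nibble
            out[2 * i + 1] = HEX[v & 0x0f]; // low nibble
        }
        return new String(out);
    }
}
```

The trade-off is that IDs built this way no longer carry a meaningful timestamp and are not roughly ordered by creation time, which some applications rely on.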

tonysepia answered Oct 07 '22 19:10