
Why does db.insert(dict) add _id key to the dict object while using pymongo

I am using pymongo in the following way:

from pymongo import MongoClient
db1 = MongoClient().db1  # assumes a MongoDB server running on localhost
a = {'key1': 'value1'}
db1.collection1.insert(a)
print(a)

This prints

{'_id': ObjectId('53ad61aa06998f07cee687c3'), 'key1': 'value1'}

on the console. I understand that _id is added to the MongoDB document, but why is it also added to my Python dictionary? I did not intend this, and I am wondering what its purpose is. I could be using this dictionary for other purposes too, and it gets updated as a side effect of inserting it into the collection. If I have to, say, serialise this dictionary to JSON, I get a

ObjectId('53ad610106998f0772adc6cb') is not JSON serializable

error. Shouldn't the insert function leave the dictionary unchanged while inserting the document into the database?

user835199 asked Jun 27 '14 12:06



1 Answer

Like many other database systems, MongoDB needs a unique identifier to retrieve each record, and PyMongo adds that identifier to the document as soon as it is inserted. (What would happen if you inserted two dictionaries with the same content {'key1':'value1'} into the database? How would you say that you want this one and not that one?)

This is explained in the PyMongo docs:

When a document is inserted a special key, "_id", is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection.
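If the side effect on your own dictionary is the problem, one option is to hand the driver a copy. The sketch below does not talk to MongoDB: `fake_insert` is a hypothetical stand-in that only mimics the "add _id in place when missing" behaviour described in the docs, and it uses a UUID where a real driver would generate an ObjectId.

```python
import copy
import uuid


def fake_insert(document):
    """Hypothetical stand-in for a driver insert: adds "_id" in place if missing."""
    if "_id" not in document:
        # Assumption for illustration: a real driver would generate an ObjectId here.
        document["_id"] = uuid.uuid4().hex
    return document["_id"]


# Passing a shallow copy: the copy receives the "_id", the original stays untouched.
a = {"key1": "value1"}
inserted_id = fake_insert(copy.copy(a))
print(a)  # {'key1': 'value1'}

# Passing the dictionary itself: it is mutated, as in the question.
b = {"key1": "value1"}
fake_insert(b)
print("_id" in b)  # True
```

A shallow copy should be enough here because only a top-level key is added. With a real collection the same idea would look like collection.insert(dict(a)), and the returned value still gives you the generated id.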

If you want to change this behavior, you could give the dictionary an _id key before inserting. In my opinion, this is a bad idea: it can easily lead to collisions, and you lose the useful information stored in a "real" ObjectId, such as the creation time, which is handy for sorting and things like that.

>>> a = {'_id': 'hello', 'key1':'value1'}
>>> collection.insert(a)
'hello'
>>> collection.find_one({'_id': 'hello'})
{u'key1': u'value1', u'_id': u'hello'}
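To illustrate the creation time mentioned above: the first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds, so it can be decoded with the standard library alone. This is a minimal sketch; `objectid_creation_time` is a name made up for this example, and the bson package exposes the same information as `ObjectId.generation_time`.

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex):
    """Decode the creation time from the first 4 bytes of an ObjectId hex string."""
    seconds = int(oid_hex[:8], 16)  # 8 hex digits = 4 bytes, Unix time in seconds
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The ObjectId from the question's output.
created = objectid_creation_time("53ad61aa06998f07cee687c3")
print(created.isoformat())  # 2014-06-27T12:20:58+00:00
```

A custom string _id such as 'hello' carries no such timestamp, which is the information you give up.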

Or, if your problem arises when serializing to JSON, you can use the utilities in the bson package:

>>> a = {'key1':'value1'}
>>> collection.insert(a)
ObjectId('53ad6d59867b2d0d15746b34')
>>> from bson import json_util, ObjectId
>>> json_util.dumps(collection.find_one({'_id': ObjectId('53ad6d59867b2d0d15746b34')}))
'{"key1": "value1", "_id": {"$oid": "53ad6d59867b2d0d15746b34"}}'

(You can verify that this is valid JSON on sites like jsonlint.com.)
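If you would rather stay with the standard json module and a plain string id is enough, a `default` hook is another possibility. The sketch below is an assumption-laden illustration: `StandInObjectId` is a hypothetical class used only so the example runs without pymongo installed. It relies on the fact that str() of a real ObjectId returns its 24-character hex string.

```python
import json


class StandInObjectId:
    """Hypothetical stand-in for bson.ObjectId so this sketch runs without pymongo."""

    def __init__(self, hex_string):
        self._hex = hex_string

    def __str__(self):
        return self._hex


doc = {"_id": StandInObjectId("53ad6d59867b2d0d15746b34"), "key1": "value1"}

# json.dumps calls `default` for any object it cannot serialize by itself.
encoded = json.dumps(doc, default=str)
print(encoded)  # {"_id": "53ad6d59867b2d0d15746b34", "key1": "value1"}
```

Unlike json_util.dumps, this loses the {"$oid": ...} wrapper, so the id comes back as an ordinary string when the JSON is parsed again.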

BorrajaX answered Oct 23 '22 00:10
