Why does replacing a document with itself in PyMongo report a modification?
item = db.test.find_one()
result = db.test.replace_one(item, item)
print(result.raw_result)
# Gives: {u'n': 1, u'nModified': 1, u'ok': 1, 'updatedExisting': True}
print(result.modified_count)
# Gives 1
whereas the equivalent in the mongo shell always reports 0 modified documents:
item = db.test.findOne()
db.test.replaceOne(item, item)
# Gives: {"acknowledged" : true, "matchedCount" : 1.0, "modifiedCount" : 0.0}
How can I get consistent results and properly detect when the replacement is actually changing the data?
This is because MongoDB stores documents in a binary format (BSON). Key-value pairs in a BSON document can have any order (except that _id is always first). Let's start with the mongo shell. The mongo shell preserves the key order when reading and writing data. For example:
> db.collection.insert({_id:1, a:2, b:3})
{ "_id" : 1, "a" : 2, "b" : 3 }
If you call replaceOne() with this document value, no modification is made, because an identical BSON document already exists:
> var doc = db.collection.findOne()
> db.collection.replaceOne(doc, doc)
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
However, if you change the ordering of the fields, it detects a modification:
> var doc_2 = {_id:1, b:3, a:2}
> db.collection.replaceOne(doc_2, doc_2)
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
Now let's step into the Python world. By default, PyMongo represents BSON documents as Python dictionaries, and the order of keys in a Python dictionary is not defined (this applies before Python 3.7, where insertion order became guaranteed). Therefore, you cannot predict how the dictionary will be serialised to BSON. As per your example:
> doc = db.collection.find_one()
{u'_id': 1.0, u'a': 2.0, u'b': 3.0}
> result = db.collection.replace_one(doc, doc)
> result.raw_result
{u'n': 1, u'nModified': 1, u'ok': 1, 'updatedExisting': True}
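To picture why key order alone can show up as a modification, here is a minimal sketch that uses JSON text as a stand-in for BSON bytes. That substitution is an assumption made purely for illustration; the real comparison happens on the server against BSON. OrderedDict is used so that the key order is explicit.

```python
import json
from collections import OrderedDict

# Same key-value pairs, different key order.
stored = OrderedDict([("_id", 1), ("a", 2), ("b", 3)])
reordered = OrderedDict([("_id", 1), ("b", 3), ("a", 2)])

def serialise(doc):
    # JSON stands in for BSON here: both write keys in iteration order.
    return json.dumps(doc).encode("utf-8")

# As plain mappings the two documents hold the same data...
same_data = dict(stored) == dict(reordered)

# ...but their serialised forms differ, which is what an
# order-sensitive comparison would see.
same_bytes = serialise(stored) == serialise(reordered)

print(same_data, same_bytes)
```

The data is equal while the serialised bytes are not, which matches the nModified: 1 result above.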
If it matters for your use case, one workaround is to use bson.SON. For example:
> from bson import CodecOptions, SON
> opts=CodecOptions(document_class=SON)
> collection_son = db.collection.with_options(codec_options=opts)
> doc_2 = collection_son.find_one()
SON([(u'_id', 1.0), (u'a', 2.0), (u'b', 3.0)])
> result = collection_son.replace_one(doc_2, doc_2)
> result.raw_result
{u'n': 1, u'nModified': 0, u'ok': 1, 'updatedExisting': True}
You can also observe that bson.SON is used within PyMongo itself (v3.3.0), i.e. in the _update() method. See also the related article: PyMongo and Key Order in SubDocuments.
Update to answer an additional question:
As far as I know, there is no 'standard' function to convert a nested dictionary to SON. You can, however, write a custom dict to SON converter yourself, for example:
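from bson.son import SON

def to_son(value):
    # Recursively convert dicts (including dicts inside lists) to SON.
    if isinstance(value, dict):
        return SON([(k, to_son(v)) for k, v in value.items()])
    if isinstance(value, list):
        return [to_son(x) for x in value]
    # Scalars and other values are returned unchanged.
    return value

# Assuming the order of the dictionary is as you desired.
to_son(a_nested_dict)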
Or utilise BSON as an intermediate format:
from bson import CodecOptions, SON, BSON
nested_bson = BSON.encode(a_nested_dict)
nested_son = BSON.decode(nested_bson, codec_options=CodecOptions(document_class=SON))
Once in SON format, you can convert back to a Python dictionary using SON.to_dict().
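If the goal is only to know whether a replacement would actually change the data, regardless of key order, one option is to compare the documents on the client before writing. This is my own suggestion rather than a PyMongo feature, and the helper name would_change is hypothetical. The sketch assumes plain Python dictionaries (for example after SON.to_dict()), whose equality ignores key order at every nesting level.

```python
def would_change(stored, replacement):
    """Return True if replacement differs from stored, ignoring key order.

    Hypothetical helper; assumes both arguments are plain dicts.
    """
    return dict(stored) != dict(replacement)


stored = {"_id": 1, "a": 2, "nested": {"x": 1, "y": 2}}
same_but_reordered = {"nested": {"y": 2, "x": 1}, "a": 2, "_id": 1}
edited = {"_id": 1, "a": 5, "nested": {"x": 1, "y": 2}}

print(would_change(stored, same_but_reordered))  # False
print(would_change(stored, edited))              # True
```

You could then call replace_one() only when the helper returns True. Note that Python treats 1 and 1.0 as equal, so a change of numeric type alone would not be detected by this comparison.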