I'm currently trying to write a script that inserts documents into a MongoDb and returns where each element is stored. Very simple thanks to insert_many()
, however my problem occurs if there is an error while I'm inserting.
I won't be able to get the ids that have just been inserted.
from pymongo import MongoClient
client = MongoClient(...)
db = client.test
r = db.test.insert_many([{'foo': 1}, {'foo': 2}, {'foo': 3}])
r.inserted_ids
#: [ObjectId('56b2a592dfcce9001a6efff8'),
#: ObjectId('56b2a592dfcce9001a6efff9'),
#: ObjectId('56b2a592dfcce9001a6efffa')]
list(db.test.find())
#: [{'_id': ObjectId('56b2a592dfcce9001a6efff8'), 'foo': 1},
#: {'_id': ObjectId('56b2a592dfcce9001a6efff9'), 'foo': 2},
#: {'_id': ObjectId('56b2a592dfcce9001a6efffa'), 'foo': 3}]
# This is dead stupid, but forcing an error by re-using the ObjectId we just generated
r2 = db.test.insert_many([{'foo': 4}, {'_id': r.inserted_ids[0], 'foo': 6}, {'foo': 7}])
#: ---------------------------------------------------------------------------
#: BulkWriteError Traceback (most recent call last)
#: <Cut in the interest of time>
Of course, r2
is not initialized, so I can't ask for inserted_ids
, however, there will have been one record inserted into the database:
list(db.test.find())
#: [{'_id': ObjectId('56b2a592dfcce9001a6efff8'), 'foo': 1},
#: {'_id': ObjectId('56b2a592dfcce9001a6efff9'), 'foo': 2},
#: {'_id': ObjectId('56b2a592dfcce9001a6efffa'), 'foo': 3},
#: {'_id': ObjectId('56b2a61cdfcce9001a6efffd'), 'foo': 4}]
What I want, is to be able to reliably figure out what Id's were inserted in order. Something Like:
r2.inserted_ids
#: [ObjectId('56b2a61cdfcce9001a6efffd'),
#: None, # or maybe even some specific error for this point.
#: None]
Setting ordered=False
still gives the error so r2
won't be initialized, (and it won't reliably return the ids in the order I gave anyway).
Is there any option here?
pymongo sets the _id
field at client side, before sending it to the server. It modifies the documents you pass in place.
This means that all the documents you pass are left with the _id
field set -- the successful ones and the failed ones.
So you just need to figure out which ones are successful. This can be done like @Austin explained.
Something like:
docs = [{'foo': 1}, {'foo': 2}, {'foo': 3}]
try:
r = db.test.insert_many(docs)
except pymongo.errors.OperationFailure as exc:
inserted_ids = [ doc['_id'] for doc in docs if not is_failed(doc, exc) ]
else:
inserted_ids = r.inserted_ids
is_failed(doc, exc)
can be implemented by searching doc
in the list of failed documents in the exception details, as explained by @Austin.
Catch the thrown exception. At least according to this site, the returned error details includes the bad record. That should enable you to determine the successful records.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With