
How to recover from error during bulk insert in MongoDB

Tags:

mongodb

I am building a web application with MongoDB and am now creating the admin pages, which let administrators add or remove items on the website. The admin page will include a bulk import feature for importing content from local CSV files. The problem is how to implement this feature.

The simplest approach is to convert uploaded CSV files into JSON and just insert them using db.items.insert([{...}, {...}, ...]) statement.

If db.getLastError() returns null, the import succeeded and there is no problem.

However, what should be done if an error occurs during the bulk insert? Because there are no transactions, the items that were already inserted cannot be rolled back, so retrying the insert will result in duplicate documents.

What is the best way to solve this problem?

Akihiro HARAI asked Dec 27 '22 04:12


1 Answer

As of MongoDB 2.4, if any of the documents in a bulk insert fails, you will only get the last exception.

Rolling back on failure

If you want a transactional approach for a bulk insert (i.e. all inserts must succeed, or the batch is rolled back), include an identifying field such as batch_id in the documents you insert. On any failure you can then remove the documents that were inserted from that batch and handle the error appropriately (retry or fail).
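
As a rough illustration of the batch_id idea, here is a minimal sketch. It assumes a collection object with async insertMany and deleteMany methods, as in the Node.js driver. The function name insertBatchOrRollBack and the way the batch identifier is generated are hypothetical choices for this example, not part of MongoDB.

```javascript
// Sketch: tag every document with a batch_id, insert the batch, and on any
// failure delete whatever part of that batch was written.
// Assumption: `collection` exposes async insertMany/deleteMany like the
// Node.js driver's Collection. The default batchId below is a simple
// illustrative value, not a guaranteed-unique identifier.
async function insertBatchOrRollBack(
  collection,
  docs,
  batchId = `${Date.now()}-${Math.random().toString(36).slice(2)}`
) {
  // Copy each document so the caller's objects are not modified.
  const tagged = docs.map((doc) => ({ ...doc, batch_id: batchId }));
  try {
    await collection.insertMany(tagged, { ordered: true });
    return { ok: true, batchId };
  } catch (err) {
    // Remove the documents from this batch that did get inserted.
    await collection.deleteMany({ batch_id: batchId });
    return { ok: false, batchId, error: err };
  }
}
```

After a result with ok set to false, the collection should hold nothing from that batch, so the caller can fix the input and retry without creating duplicates. If the cleanup itself fails, the exception from deleteMany propagates and the batch_id is still the handle for a later cleanup.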

Bulk insert approaches

If you are not inserting into a sharded cluster:

  • set the ContinueOnError flag to false to ensure that the bulk insert stops on the first error
  • handle exceptions and decide whether to roll back the batch or re-insert starting from the document after the one that caused the exception
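
The "re-insert starting from the document after the one that caused the exception" option might look like the sketch below. It assumes a driver where an ordered insertMany (the modern counterpart of ContinueOnError set to false) throws an error with a writeErrors property, either one entry or an array, and that each entry has an index into the array that was passed in. The Node.js driver's bulk write error has this shape; other drivers may differ. The function name insertSkippingFailures is hypothetical.

```javascript
// Sketch: insert in order; when a document is rejected, record it and
// resume with the document after it.
// Assumption: the thrown error has `writeErrors` (one entry or an array)
// whose entries carry an `index` relative to the array given to insertMany.
async function insertSkippingFailures(collection, docs) {
  const failed = [];
  let start = 0;
  while (start < docs.length) {
    try {
      await collection.insertMany(docs.slice(start), { ordered: true });
      break; // the remainder of the batch was inserted
    } catch (err) {
      const first = [].concat(err.writeErrors || [])[0];
      // Without a per-document index (e.g. a network error) we cannot
      // tell where to resume, so hand the error back to the caller.
      if (!first || typeof first.index !== "number") throw err;
      const failedAt = start + first.index;
      failed.push({ index: failedAt, error: first });
      start = failedAt + 1; // skip the offending document
    }
  }
  return failed; // positions in `docs` that were not inserted
}
```

The returned list can be shown to the administrator so that only the rejected CSV rows need to be corrected and re-imported.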

If you are inserting into a sharded cluster:

  • the ContinueOnError flag is currently always set to true
  • on error you will have to loop through the documents in that batch and try to insert (or upsert) them individually so you can catch and handle any specific exceptions.
  • see also: Strategies for Bulk Inserts to a Sharded Collection.
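
For the sharded case, the per-document loop could be sketched as follows. It assumes every document already has a deterministic _id (for example, derived from a key column in the CSV) and that the collection object offers an async replaceOne with an upsert option, as the Node.js driver does. On a sharded collection the filter may also need to include the shard key. The function name upsertIndividually is hypothetical.

```javascript
// Sketch: write each document on its own so that a failure can be tied
// to one specific document. Upserting on _id keeps a retry from
// creating duplicates: re-running the loop overwrites existing documents.
// Assumption: each doc has an _id, and `collection.replaceOne` behaves
// like the Node.js driver's method.
async function upsertIndividually(collection, docs) {
  const failures = [];
  for (const doc of docs) {
    try {
      await collection.replaceOne({ _id: doc._id }, doc, { upsert: true });
    } catch (err) {
      // Keep going; report the failing documents at the end.
      failures.push({ doc, error: err });
    }
  }
  return failures;
}
```

This trades throughput for precise error handling, so it is probably best kept as the fallback path after a bulk insert has already failed.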
Stennie answered Dec 28 '22 23:12