Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB batch insert with unique index

Suppose I have a list of documents that I want to insert.

The documents have structure as follow

{
  url: string,
  visited: boolean
}

I have a unique index on the url key.

When I insert the documents, if there is 1 duplicate found, the whole operation is aborted.

Is there a way that I can still use batch insert, and it will insert all that documents that are not duplicates?

As a workaround, I have to insert documents one by one, and I am afraid of the performance impact from the overhead of insertion.

like image 668
Khanetor Avatar asked Oct 23 '15 04:10

Khanetor


1 Answers

Suppose I have a collection with unique index on the field named a as show by the output of .getIndexes()

> db.collection.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "test.collection"
        },
        {
                "v" : 1,
                "unique" : true,
                "key" : {
                        "a" : 1
                },
                "name" : "a_1",
                "ns" : "test.collection"
        }
]

and the following array of documents:

var docs = [ { 'a': 3  }, { 'a': 4 }, { 'a': 3 } ];

You can use the Bulk() API to insert bulk insert those documents into your collection but you need to use unordered operations which allow MongoDB to continue processing the remaining write operations in the list even if an error occurs during the processing of one of the write operations.

var docs = [{a: 3}, {a: 4}, {a: 3}]
var bulk = db.collection.initializeUnorderedBulkOp();
for(var ind=0; ind<docs.length; ind++) { 
    bulk.insert(docs[ind]); 
}
bulk.execute();

You will get an error message after execution of the query but non duplicates documents will be insert in your collection.

Here our collection was empty before the operation but after calling the bulk.execute() db.collection.find() yields the following documents:

{ "_id" : ObjectId("5629c5d1e6d6a6b8e38d013c"), "a" : 3 }
{ "_id" : ObjectId("5629c5d1e6d6a6b8e38d013d"), "a" : 4 }
like image 129
styvane Avatar answered Nov 21 '22 22:11

styvane