E11000 duplicate key error index: While Creating Unique index

I am running the query below in Robomongo, but it is giving the error shown below. I am trying to remove the duplicate entries in the url field using this query. Is there any problem with my query?

db.dummy_data.createIndex({"url":1},{unique:true},{dropDups:true})

My error is E11000 duplicate key error index: mydb.dummy_data.$url_1 dup key: {"some url"}

Asked by Juhan

1 Answer

The first problem is the syntax: all of the index options belong in a single second argument, so the separate { dropDups: true } document in your call is simply ignored. When your syntax is corrected from the incorrect usage to:

db.dummy_data.ensureIndex({ "url": 1},{ "unique": true, "dropDups": true })

You report that you still get an error message, but a new one:

{ "connectionId" : 336, "err" : "too may dups on index build with dropDups=true", "code" : 10092, "n" : 0, "ok" : 1 }

There is this message on Google Groups which leads to the suggested method:

Hi Daniel,

The assertion indicates that the number of duplicates met or exceeded 1000000. In addition, there's a comment in the source that says, "we could queue these on disk, but normally there are very few dups, so instead we keep in ram and have a limit." (where the limit == 1000000), so it might be best to start with an empty collection, ensureIndex with {dropDups: true}, and reimport the actual documents.

Let us know if that works better for you.

So as that suggests, create a new collection and import everything into it. Basic premise:

// build the unique index first, so duplicate urls cannot make it into the new collection
db.newdata.ensureIndex({ "url": 1 }, { "unique": true, "dropDups": true });

// copy every document across; a document whose url is already present simply fails the unique constraint
db.dummy_data.find().forEach(function(doc) {
    db.newdata.insert(doc);
});

Or better yet, using the Bulk operations API to reduce the overhead of the writes:

db.newdata.ensureIndex({ "url": 1 }, { "unique": true, "dropDups": true });

var bulk = db.newdata.initializeUnorderedBulkOp();
var counter = 0;

db.dummy_data.find().forEach(function(doc) {
    counter++;
    bulk.insert( doc );

    // send a batch of 1000 inserts to the server, then start a new batch
    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db.newdata.initializeUnorderedBulkOp();
    }
});

// flush whatever is left in the final partial batch
if ( counter % 1000 != 0 )
    bulk.execute();
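
Once the loop finishes (with either version), the deduplicated documents are sitting in newdata. As a quick sanity check, and just as a sketch using the collection names from above, the distinct url count of the original collection should match the document count of the new one:

// total documents in the original collection, duplicates included
db.dummy_data.count();

// documents that made it past the unique index
db.newdata.count();

// distinct urls in the original -- should equal db.newdata.count()
db.dummy_data.distinct("url").length;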

However you approach the migration from one collection to another, this seems to be the only way of handling a high volume of duplicates on a unique key at present.
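
If the cleaned-up data should end up under the original collection name, one possible final step (again only a sketch: renameCollection with dropTarget set to true drops the existing dummy_data, so be certain you no longer need the originals, or keep a backup) is:

// replace the old collection with the deduplicated one
db.newdata.renameCollection("dummy_data", true);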

Answered by Neil Lunn