Create an index with MongoDB

I'm a beginner with MongoDB and I'm trying a few things out. I want to store URLs, and to avoid duplicates I create a unique index on the url field, like this:

collection.createIndex(new BasicDBObject("url", type).append("unique", true));

But the index is created again each time I launch my program, isn't it?

Right now my program only inserts one URL, "http://site.com", and if I restart the program this URL is inserted again, as if there were no index.

Is creating the index on every run the wrong way to handle an index?

Here is an example of my code

mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", "true"));

mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

And the output:

{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf106"} , "url" : "http://site.com" , "crawled" : 0}
{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf107"} , "url" : "http://site.com" , "crawled" : 0}

Thanks

EDIT :

Here is my Mongo class, which handles MongoDB:

import java.net.UnknownHostException;
import java.util.List;
import java.util.Set;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class Mongo {

    private MongoClient mongoClient;
    private DB db;
    private DBCollection collection;
    private String db_name;

    public Mongo(String db){

        try {
            mongoClient = new MongoClient( "localhost" , 27017 );

            this.db = mongoClient.getDB(db);
            this.db_name = db;
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }

    }

    public void drop(){
        mongoClient.dropDatabase(db_name);
    }

    public void listCollections(){
        Set<String> colls = db.getCollectionNames();

        for (String s : colls) {
            System.out.println(s);
        }
    }

    public void listIndex(){
         List<DBObject> list = collection.getIndexInfo();

            for (DBObject o : list) {
                System.out.println("\t" + o);
            }
    }

    public void setCollection(String col){
        this.collection = db.getCollection(col);
    }

    public void insert(BasicDBObject doc){

        this.collection.insert(doc);

    }

    public DBCollection getCollection(){
        return collection;
    }

    public void createIndex(String on, int type){
        collection.ensureIndex(new BasicDBObject(on, type).append("unique", true));
    }


}

And here is the class that runs my program:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;

public class Explorer {

    private final static boolean DEBUG = false;
    private final static boolean RESET = false;

    private Mongo mongo;

    private String host;

    public Explorer(String url){
        mongo = new Mongo("explorer");
        mongo.setCollection("page");

        if (RESET){
            mongo.drop();
            System.out.println("Set RESET to FALSE and restart the program.");
            System.exit(1);
        }

        if (DEBUG) {
            mongo.listCollections();

        }

        this.host = url.toLowerCase();



        BasicDBObject doc = new BasicDBObject("url", "http://site.com").append("crawled", 0);

        mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", true));

        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));




        process();
    }


    private void process(){


        BasicDBObject query = new BasicDBObject("crawled", 0);

        DBCursor cursor = mongo.getCollection().find(query);

        try {
            while(cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }

    }
}
asked by guillaume, Dec 22 '12 at 20:12



2 Answers

You need to pass unique as the boolean value true, not as a string, and the options belong in the second parameter, not in the key document:

...ensureIndex(new BasicDBObject("url", 1), new BasicDBObject("unique", true));

Also, I tested it manually using the mongo interpreter:

> db.createCollection("sa")
{ "ok" : 1 }
> db.sa.ensureIndex({"url":1},{unique:true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
E11000 duplicate key error index: test.sa.$url_1  dup key: { : "http://www.example.com" }
> db.sa.insert({url:"http://www.example2.com/", crawled: false})
> db.sa.insert({url:"http://www.example.com", crawled: false})
E11000 duplicate key error index: test.sa.$url_1  dup key: { : "http://www.example.com" }
>

There are only the two objects:

> db.sa.find()
{ "_id" : ObjectId("50d636baa050939da1e4c53b"), "url" : "http://www.example.com", "crawled" : true }
{ "_id" : ObjectId("50d636dba050939da1e4c53d"), "url" : "http://www.example2.com/", "crawled" : false }
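
The difference between the one-argument and two-argument calls can be sketched without a driver or a server. The following is only a model, assuming plain LinkedHashMap objects in place of BasicDBObject; IndexSpecSketch, IndexRequest, doc and isUnique are hypothetical names, not part of the MongoDB Java driver. It shows that appending "unique" to the key document leaves the options empty, whereas a separate options document carries the flag.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Driver-free model (an assumption for illustration): maps stand in for BasicDBObject.
class IndexSpecSketch {

    // Hypothetical stand-in for an index request: a key document plus an options document.
    static final class IndexRequest {
        final Map<String, Object> keys;
        final Map<String, Object> options;

        IndexRequest(Map<String, Object> keys, Map<String, Object> options) {
            this.keys = keys;
            this.options = options;
        }

        // Only the options document decides uniqueness in this model.
        boolean isUnique() {
            return Boolean.TRUE.equals(options.get("unique"));
        }
    }

    // Models the one-argument call: everything passed in is treated as index keys.
    static IndexRequest ensureIndex(Map<String, Object> keys) {
        return new IndexRequest(keys, new LinkedHashMap<>());
    }

    // Models the two-argument call: keys and options are kept apart.
    static IndexRequest ensureIndex(Map<String, Object> keys, Map<String, Object> options) {
        return new IndexRequest(keys, options);
    }

    // Small helper that builds a one-entry, insertion-ordered document.
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    public static void main(String[] args) {
        // Shape used in the question: "unique" ends up inside the key document.
        Map<String, Object> mixed = doc("url", 1);
        mixed.put("unique", true);
        IndexRequest wrong = ensureIndex(mixed);

        // Shape used in the fix: "unique" travels in its own options document.
        IndexRequest right = ensureIndex(doc("url", 1), doc("unique", true));

        System.out.println(wrong.keys + " unique=" + wrong.isUnique()); // {url=1, unique=true} unique=false
        System.out.println(right.keys + " unique=" + right.isUnique()); // {url=1} unique=true
    }
}
```

In this model the first request describes an index over two fields, url and unique, with no uniqueness constraint, which is consistent with both inserts in the question succeeding.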
answered by WiredPrairie, Oct 13 '22 at 22:10


I don't fully understand your problem, but it's very likely that you should use ensureIndex instead of createIndex: the latter always tries to create the index, while the former only ensures that it exists.
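
The "ensure that it exists" behaviour can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the driver's implementation: UniqueIndexSketch, ensureUniqueIndex, insert, count and page are all hypothetical names, and a list of maps stands in for a collection. It shows that requesting the same index on every start is harmless, and that a second document with the same url is rejected once the index really is unique.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// In-memory model only (an assumption for illustration); not the MongoDB driver API.
class UniqueIndexSketch {
    private final List<Map<String, Object>> docs = new ArrayList<>();
    private final Set<String> uniqueFields = new HashSet<>();

    // Returns true only when the index is newly registered; repeat calls change nothing.
    boolean ensureUniqueIndex(String field) {
        return uniqueFields.add(field);
    }

    // Rejects a document whose value for any uniquely indexed field is already stored.
    void insert(Map<String, Object> doc) {
        for (String field : uniqueFields) {
            Object value = doc.get(field);
            for (Map<String, Object> existing : docs) {
                if (Objects.equals(existing.get(field), value)) {
                    throw new IllegalStateException("duplicate key on " + field + ": " + value);
                }
            }
        }
        docs.add(doc);
    }

    int count() {
        return docs.size();
    }

    // Builds a document shaped like the one in the question.
    static Map<String, Object> page(String url) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("url", url);
        m.put("crawled", 0);
        return m;
    }

    public static void main(String[] args) {
        UniqueIndexSketch pages = new UniqueIndexSketch();
        System.out.println(pages.ensureUniqueIndex("url")); // true: index registered
        System.out.println(pages.ensureUniqueIndex("url")); // false: already there, nothing to do

        pages.insert(page("http://site.com"));
        try {
            pages.insert(page("http://site.com"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage()); // rejected: duplicate key on url: http://site.com
        }
        System.out.println(pages.count()); // 1
    }
}
```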

answered by ghik, Oct 13 '22 at 21:10