I'm a beginner with MongoDB and I'm trying a few things out. I want to store URLs, and to avoid duplicates I create a unique index on the url field, like this:
collection.createIndex(new BasicDBObject("url", type).append("unique", true));
But is the index created again each time I launch my program?
I ask because my program currently inserts only one URL, "http://site.com", and if I restart the program this URL is inserted again, as if there were no index.
Is creating the index on every run the wrong way to handle an index?
Here is an example of my code:
mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", "true"));
mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));
mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));
And the output:
{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf106"} , "url" : "http://site.com" , "crawled" : 0}
{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf107"} , "url" : "http://site.com" , "crawled" : 0}
Thanks
EDIT :
Here is my Mongo class, which handles MongoDB:
import java.net.UnknownHostException;
import java.util.List;
import java.util.Set;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
public class Mongo {
    private MongoClient mongoClient;
    private DB db;
    private DBCollection collection;
    private String db_name;

    public Mongo(String db){
        try {
            mongoClient = new MongoClient( "localhost" , 27017 );
            this.db = mongoClient.getDB(db);
            this.db_name = db;
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
    }

    public void drop(){
        mongoClient.dropDatabase(db_name);
    }

    public void listCollections(){
        Set<String> colls = db.getCollectionNames();
        for (String s : colls) {
            System.out.println(s);
        }
    }

    public void listIndex(){
        List<DBObject> list = collection.getIndexInfo();
        for (DBObject o : list) {
            System.out.println("\t" + o);
        }
    }

    public void setCollection(String col){
        this.collection = db.getCollection(col);
    }

    public void insert(BasicDBObject doc){
        this.collection.insert(doc);
    }

    public DBCollection getCollection(){
        return collection;
    }

    public void createIndex(String on, int type){
        collection.ensureIndex(new BasicDBObject(on, type).append("unique", true));
    }
}
And here is the class that runs my program:
public class Explorer {
    private final static boolean DEBUG = false;
    private final static boolean RESET = false;
    private Mongo mongo;
    private String host;

    public Explorer(String url){
        mongo = new Mongo("explorer");
        mongo.setCollection("page");
        if (RESET){
            mongo.drop();
            System.out.println("Set RESET to FALSE and restart the program.");
            System.exit(1);
        }
        if (DEBUG) {
            mongo.listCollections();
        }
        this.host = url.toLowerCase();
        BasicDBObject doc = new BasicDBObject("url", "http://site.com").append("crawled", 0);
        mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", true));
        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));
        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));
        process();
    }

    private void process(){
        BasicDBObject query = new BasicDBObject("crawled", 0);
        DBCursor cursor = mongo.getCollection().find(query);
        try {
            while(cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }
    }
}
You need to pass unique as the boolean value true, not as a string, and it belongs in the second parameter, which holds the index options:
...ensureIndex(new BasicDBObject("url", 1), new BasicDBObject("unique", true));
I also tested it manually in the mongo shell:
> db.createCollection("sa")
{ "ok" : 1 }
> db.sa.ensureIndex({"url":1},{unique:true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
E11000 duplicate key error index: test.sa.$url_1 dup key: { : "http://www.example.com" }
> db.sa.insert({url:"http://www.example2.com/", crawled: false})
> db.sa.insert({url:"http://www.example.com", crawled: false})
E11000 duplicate key error index: test.sa.$url_1 dup key: { : "http://www.example.com" }
>
Only two documents were stored:
> db.sa.find()
{ "_id" : ObjectId("50d636baa050939da1e4c53b"), "url" : "http://www.example.com", "crawled" : true }
{ "_id" : ObjectId("50d636dba050939da1e4c53d"), "url" : "http://www.example2.com/", "crawled" : false }
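To make the difference between the two calls concrete, here is a small in-memory sketch. It is not the MongoDB driver API: `UniqueIndexSketch`, `ToyCollection` and `page` are hypothetical names made up for illustration. It assumes that appending "unique" to the key document gives you an ordinary, non-unique compound index on the fields url and unique, whereas passing it as an option gives you a unique index on url alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory model only; none of these names come from the MongoDB driver.
class UniqueIndexSketch {

    /** A toy collection with a single index over the given key fields. */
    static class ToyCollection {
        private final List<String> keyFields;
        private final boolean unique;
        private final Set<List<Object>> seenKeys = new HashSet<>();
        private final List<Map<String, Object>> docs = new ArrayList<>();

        ToyCollection(List<String> keyFields, boolean unique) {
            this.keyFields = keyFields;
            this.unique = unique;
        }

        /** Returns false and stores nothing if the unique index rejects the document. */
        boolean insert(Map<String, Object> doc) {
            List<Object> key = new ArrayList<>();
            for (String field : keyFields) {
                key.add(doc.get(field)); // a missing field is treated as null in this model
            }
            if (unique && seenKeys.contains(key)) {
                return false; // stands in for the E11000 duplicate key error
            }
            seenKeys.add(key);
            docs.add(doc);
            return true;
        }

        int size() {
            return docs.size();
        }
    }

    /** Builds a document shaped like the ones in the question. */
    static Map<String, Object> page(String url) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("url", url);
        doc.put("crawled", 0);
        return doc;
    }

    public static void main(String[] args) {
        // Question's call: "unique" sits in the key document, so the uniqueness option is never set.
        ToyCollection keyOnly = new ToyCollection(Arrays.asList("url", "unique"), false);
        keyOnly.insert(page("http://site.com"));
        keyOnly.insert(page("http://site.com"));
        System.out.println("unique in key document: " + keyOnly.size() + " documents");

        // Corrected call: key is just "url", uniqueness is passed as an option.
        ToyCollection withOption = new ToyCollection(Arrays.asList("url"), true);
        withOption.insert(page("http://site.com"));
        withOption.insert(page("http://site.com"));
        System.out.println("unique as option: " + withOption.size() + " documents");
    }
}
```

Running it reports 2 documents for the first collection and 1 for the second, which matches the duplicate rows in the question and the E11000 rejection in the shell session above.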
I don't fully understand your problem, but I think it's very likely that you should use ensureIndex instead of createIndex, as the latter always tries to create the index while the former only ensures that it exists.