I am trying to write a lot of data to MongoDB in a Java loop, and I am getting errors related to the number of open connections.
My theory is that since MongoDB is not transactional, lots of connections can be opened simultaneously. However, the Java code also loops very fast, so after a while the number of loop iterations overtakes the number of available connections and Mongo hits a wall.
My code looks like this. I've seen it recommended not to call m.close(), but then you just get the error even faster.
    public static void upsert() {
        Mongo m = null;
        DB db = null;
        try {
            m = new Mongo("localhost");
            db = m.getDB("sempedia");
        } catch (UnknownHostException e1) {
            e1.printStackTrace();
        } catch (MongoException e1) {
            e1.printStackTrace();
        }
        // create documents
        // I am doing an upsert - hence the doc, doc
        DBCollection triples;
        try {
            triples = db.getCollection("triples");
            triples.update(doc, doc, true, false);
        } catch (MongoException e) {
            e.printStackTrace();
        }
        m.close();
    }
In my Java console I get this error:
WARNING: Exception determining maxBSON size using0 java.net.SocketException: Connection reset
And MongoDB gives this error:
Tue Oct 25 22:31:39 [initandlisten] connection refused because too many open connections: 204 of 204
What would be the most elegant way to deal with this issue?
You are creating an instance of the Mongo class for each individual operation. That won't work, because each instance creates and holds at least one connection (by default, a pool of up to 10), and those connections are only released when the Java GC cleans up your Mongo instance or when you invoke close().
The problem is that in both cases you're creating connections faster than they are being closed, even with a single thread. This exhausts the maximum number of connections in a hurry. The right fix is to keep one Mongo instance around using the singleton pattern (Mongo.Holder provides functionality for this; try Mongo.Holder.singleton().connect(..)). A quick "fix" is to increase the file descriptor limit on your machine so that the maximum number of connections is considerably higher, but you will most likely hit the same limit eventually. You can check the current and available connection counts from the mongo shell:
db.serverStatus().connections
TL;DR: Treat a Mongo instance as a singleton, make it as long-lived as possible, and you're golden. Implementing a MongoFactory with a static method getInstance() that returns a lazily created instance will do the trick just fine. Good luck.
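For what it's worth, here is a rough sketch of what such a MongoFactory could look like. To keep it runnable without the MongoDB driver on the classpath, it uses a made-up StubClient class in place of the real Mongo class; StubClient, its upsert() method and MongoFactoryDemo are purely illustrative assumptions, and only the names MongoFactory and getInstance() come from the suggestion above. In real code you would presumably construct the driver's client inside the holder instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the driver's Mongo class (an assumption, not driver API),
// so that the sketch runs without the MongoDB driver.
class StubClient {
    // Counts how many clients (and therefore connection pools) were ever created.
    static final AtomicInteger CREATED = new AtomicInteger();

    StubClient() {
        CREATED.incrementAndGet();
    }

    void upsert(String doc) {
        // A real client would send the write over one of its pooled connections here.
    }
}

// One lazily created, shared client for the whole JVM.
final class MongoFactory {
    private MongoFactory() {}

    // Initialization-on-demand holder: the JVM initializes Holder (and so creates
    // the client) on the first getInstance() call, and class initialization is
    // thread-safe, so no explicit locking is needed.
    private static final class Holder {
        static final StubClient INSTANCE = new StubClient();
    }

    static StubClient getInstance() {
        return Holder.INSTANCE;
    }
}

class MongoFactoryDemo {
    public static void main(String[] args) {
        // A tight loop like the one in the question, now reusing a single client.
        for (int i = 0; i < 100_000; i++) {
            MongoFactory.getInstance().upsert("doc-" + i);
        }
        System.out.println("clients created: " + StubClient.CREATED.get());
    }
}
```

However many times the loop runs, only one client is ever constructed, so the number of open connections is bounded by that one client's pool rather than by the number of iterations.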
You're making a new MongoClient every time you go through your method.
I had this problem as well, but I solved it by writing a checkConnection function:
    private static DBCollection checkConnection(String collection) throws UnknownHostException {
        // Create the client only once; later calls reuse the cached DB handle.
        if (db == null) {
            db = (new MongoClient(host, port)).getDB(database);
        }
        return db.getCollection(collection);
    }
At the top of the class, where you declare your fields, add this:
    private static DB db = null;
    private static String database = "<Your database>";
    private static String host = "localhost"; // usually localhost
    private static int port = 27017; // usually 27017, but you can change it
Then write your methods like this:
    public <whatever> someFunction() throws UnknownHostException {
        // "triples" here, or whatever collection you want
        DBCollection dbCollection = checkConnection("triples");
        // <rest of your function here, using dbCollection>
    }
This approach has a few advantages:
- Code reusability: you won't have to write the same thing in every method
- Readability: any programmer can understand this:
    DBCollection dbCollection = checkConnection("triples");
- Only one MongoClient (and its connection pool), which you reuse (this does not cause problems with data being out of sync)
Hope I helped