I have a consumer reading a continuous stream of events from Kafka. Every so often I have to write to a Mongo collection, so I need to keep a Mongo connection open continuously. My current solution, which feels fairly hacky, is to re-initialize the connection every 5 minutes or so. This avoids a network timeout during periods when no events arrive from Kafka and the connection sits idle.
Can anyone suggest a better way to do this? I'm pretty sure this is the wrong way to go about keeping a continuous connection to Mongo.
I'm using the pymongo client.
I have a MongoAdapter class which has helper methods:
    from pymongo import MongoClient
    import pymongo
    import time

    class MongoAdapter:
        def __init__(self, databaseName, userid, password, host):
            self.databaseName = databaseName
            self.userid = userid
            self.password = password
            self.host = host
            self.connection = MongoClient(host=self.host, maxPoolSize=100, socketTimeoutMS=1000, connectTimeoutMS=1000)
            self.getDatabase()

        def getDatabase(self):
            try:
                if self.connection[self.databaseName].authenticate(self.userid, self.password):
                    print("authenticated true")
                    self.database = self.connection[self.databaseName]
            except pymongo.errors.OperationFailure:
                print("Error: Please check Database Name, UserId,Password")
I use the class in the following way to reconnect:
    adapter_reinit_threshold = 300  # every 300 seconds, instantiate a new mongo connection
    adapter_config_time = time.time()
    while True:
        if (time.time() - adapter_config_time) > adapter_reinit_threshold:
            adapter = MongoAdapter(config.db_name, config.db_user, config.db_password, config.db_host)  # re-connect
            adapter_config_time = time.time()  # update adapter_config_time
I did it this way because I assumed the old, unused objects (with their open connections) would be garbage collected and the connections closed. Although this method works fine, I want to know whether there's a cleaner way to do it and what the pitfalls of this approach might be.
There's no need to close a Connection instance; it will clean up after itself when Python garbage collects it. You should use MongoClient instead of Connection; Connection is deprecated. To take advantage of connection pooling, you could create one MongoClient that lasts for the entire life of your process.
It is best practice to keep the connection open between your application and the database server.
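Keeping one client for the life of the process could be sketched roughly as below. This is only an illustration, not PyMongo's own API: `get_client` and `make_client` are hypothetical names, and the factory argument is there so the client is built exactly once, on first use, and then shared by every later write.

```python
import threading

_client = None              # the single, process-wide client
_lock = threading.Lock()    # guards first-time creation across threads


def get_client(make_client):
    """Return the shared client, calling make_client() only on first use."""
    global _client
    with _lock:
        if _client is None:
            _client = make_client()
        return _client


# Usage (assumes pymongo is installed and `host` is your server address):
#   from pymongo import MongoClient
#   client = get_client(lambda: MongoClient(host, maxPoolSize=100))
```

With something like this, the Kafka loop would call `get_client(...)` whenever it needs to write, instead of building a new adapter every 300 seconds.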
PyMongo's MongoClient class lets developers connect to MongoDB through client instances, which makes connecting from Python code quick and straightforward.
From the documentation of pymongo.mongo_client.MongoClient:
If an operation fails because of a network error, ConnectionFailure is raised and the client reconnects in the background. Application code should handle this exception (recognizing that the operation failed) and then continue to execute.
I don't think you need to implement your own re-connection method.
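Handling the exception and carrying on, as the documentation describes, might look something like the sketch below. The helper `call_with_retry` and its parameters are hypothetical names, not part of PyMongo; the exception class is passed in by the caller, so with PyMongo you would pass `pymongo.errors.ConnectionFailure`.

```python
import time


def call_with_retry(operation, retry_on, attempts=3, delay_s=1.0):
    """Run operation(); if it raises retry_on, wait and try again.

    Re-raises the last error once `attempts` tries have all failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the failure
            time.sleep(delay_s)  # give the client time to reconnect


# Usage with PyMongo (assumed installed; `collection` and `doc` are placeholders):
#   from pymongo.errors import ConnectionFailure
#   call_with_retry(lambda: collection.insert_one(doc), retry_on=ConnectionFailure)
```

One caveat with this approach: a write that failed with a network error may still have reached the server, so a retried insert can end up applied twice unless the document carries its own `_id`.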